Page 1: Wait Events2

1

More Examples of Interpreting Wait Events To Boost System Performance

Roger Schrag and Terry Sutton

Database Specialists, Inc.

www.dbspecialists.com

Page 2: Wait Events2

2

Session Objectives

Briefly introduce wait events:

– Define wait events

– Discuss how to use the wait event interface

Walk through five examples of how wait event information was used to troubleshoot production problems

Page 3: Wait Events2

3

“Wait Event” Defined

We say an Oracle process is “busy” when it wants CPU time.

When an Oracle process is not busy, it is waiting for something to happen.

There are only so many things an Oracle process could be waiting for, and the kernel developers at Oracle have attached names to them all.

These are wait events.

Page 4: Wait Events2

4

Wait Event Examples

An Oracle process waiting for the client application to submit a SQL statement waits on a “SQL*Net message from client” event.

An Oracle process waiting on another session to release a row-level lock waits on an “enqueue” event.

Page 5: Wait Events2

5

Wait Event Interface

Each Oracle process identifies the event it is waiting for each time a wait begins.

The instance collects cumulative statistics about events waited upon since instance startup.

You can access this information through v$ views and a wait event tracing facility.

These make up the wait event interface.

Page 6: Wait Events2

6

Viewing Wait Events

http://dbrx.dbspecialists.com/pls/dbrx/view_report

Page 7: Wait Events2

7

Why Wait Event Information Is Useful

Wait events touch all areas of Oracle—from I/O to latches to parallelism to network traffic.

Wait event data can be remarkably detailed. “Waited 0.02 seconds to read 8 blocks from file 42 starting at block 18042.”

Analyzing wait event data will yield a path toward a solution for almost any problem.

Page 8: Wait Events2

8

Important Wait Events

There were 158 wait events in Oracle 8.0.

There are 363 wait events in Oracle 9i Release 2 (9.2.0).

Most come up infrequently or are rarely significant for troubleshooting performance.

Different wait events are significant in different environments, depending on which Oracle features have been deployed.

Page 9: Wait Events2

9

A Few Common Events

buffer busy waits
control file parallel write
control file sequential read
db file parallel read / write
db file scattered read
db file sequential read
direct path read / write
enqueue
free buffer waits
latch free
library cache load lock
library cache pin
log buffer space
log file parallel write
log file sequential read
log file switch completion
log file sync
undo segment extension
write complete waits

Page 10: Wait Events2

10

Idle Events

Sometimes an Oracle process is not busy simply because it has nothing to do.

In this case the process will be waiting on an event that we call an “idle event.”

Idle events are usually not interesting from the tuning and troubleshooting perspective.

Page 11: Wait Events2

11

Common Idle Events

client message
dispatcher timer
gcs for action
gcs remote message
ges remote message
i/o slave wait
jobq slave wait
lock manager wait for remote message
null event
parallel query dequeue
pipe get
PL/SQL lock timer
pmon timer
PX Deq Credit: need buffer
PX Deq Credit: send blkd
PX Deq: Execute Reply
PX Deq: Execution Msg
PX Deq: Signal ACK
PX Deq: Table Q Normal
PX Deque wait
PX Idle Wait
queue messages
rdbms ipc message
slave wait
smon timer
SQL*Net message from client
SQL*Net message to client
SQL*Net more data from client
virtual circuit status
wakeup time manager

Page 12: Wait Events2

12

Accounted for by the Wait Event Interface

Time spent waiting for something to do (idle events)

Time spent waiting for something to happen so that work may continue (non-idle events)

Page 13: Wait Events2

13

Not Accounted for by the Wait Event Interface

Time spent using a CPU

Time spent waiting for a CPU

Time spent waiting for virtual memory to be swapped back into physical memory

Time spent on CPU-intensive activities:

– Logical reads

– Spinning while waiting for latches

– Statement parsing

Page 14: Wait Events2

14

Timed Statistics

The wait event interface will not collect timing information unless timed statistics are enabled.

Enable timed statistics dynamically at the instance or session level:

ALTER SYSTEM SET timed_statistics = TRUE;
ALTER SESSION SET timed_statistics = TRUE;

Enable timed statistics at instance startup by setting the instance parameter:

timed_statistics = true

Page 15: Wait Events2

15

The Wait Event Interface

Dynamic performance views

– v$system_event

– v$session_event

– v$event_name

– v$session_wait

Wait event tracing

Page 16: Wait Events2

16

The v$system_event View

Shows one row for each wait event name, along with cumulative statistics since instance startup. Wait events that have not occurred at least once since instance startup do not appear in this view.

Column Name                Data Type
-------------------------- ------------
EVENT                      VARCHAR2(64)
TOTAL_WAITS                NUMBER
TOTAL_TIMEOUTS             NUMBER
TIME_WAITED                NUMBER
AVERAGE_WAIT               NUMBER
TIME_WAITED_MICRO          NUMBER

Page 17: Wait Events2

17

Columns In v$system_event

EVENT: The name of a wait event

TOTAL_WAITS: Total number of times a process has waited for this event since instance startup

TOTAL_TIMEOUTS: Total number of timeouts while waiting for this event since instance startup

TIME_WAITED: Total time waited for this wait event by all processes since startup (in centiseconds)

AVERAGE_WAIT: The average length of a wait for this event since instance startup (in centiseconds)

TIME_WAITED_MICRO: Same as TIME_WAITED but in microseconds (Oracle 9i)

Page 18: Wait Events2

18

Sample v$system_event Query

SQL> SELECT event, time_waited
     FROM   v$system_event
     WHERE  event IN ('smon timer',
                      'SQL*Net message from client',
                      'db file sequential read',
                      'log file parallel write');

EVENT                             TIME_WAITED
--------------------------------- -----------
log file parallel write                159692
db file sequential read                 28657
smon timer                          130673837
SQL*Net message from client          16528989
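TIME_WAITED is reported in centiseconds, so the sample figures are easier to compare once converted to seconds and stripped of idle events. The following is a minimal Python sketch using the four rows from the sample output; the `IDLE_EVENTS` set and the function name are illustrative assumptions, not part of any Oracle interface.

```python
# Rows copied from the sample v$system_event output (TIME_WAITED in centiseconds).
sample_rows = {
    "log file parallel write": 159692,
    "db file sequential read": 28657,
    "smon timer": 130673837,
    "SQL*Net message from client": 16528989,
}

# Assumption: a small subset of the idle-event list, enough for this sample.
IDLE_EVENTS = {"smon timer", "SQL*Net message from client"}


def non_idle_seconds(rows, idle=IDLE_EVENTS):
    """Return {event: seconds} for non-idle events, largest wait first."""
    kept = {event: cs / 100 for event, cs in rows.items() if event not in idle}
    return dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True))


result = non_idle_seconds(sample_rows)
for event, seconds in result.items():
    print(f"{event:<28} {seconds:>12.2f} s")
```

The two idle events dominate the raw totals only because they accumulate whenever a process has nothing to do, which is why they are filtered out before ranking.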

Page 19: Wait Events2

19

The v$session_event View

Shows one row for each wait event name within each session, along with cumulative statistics since session start.

Column Name                Data Type
-------------------------- ------------
SID                        NUMBER
EVENT                      VARCHAR2(64)
TOTAL_WAITS                NUMBER
TOTAL_TIMEOUTS             NUMBER
TIME_WAITED                NUMBER
AVERAGE_WAIT               NUMBER
MAX_WAIT                   NUMBER
TIME_WAITED_MICRO          NUMBER

Page 20: Wait Events2

20

Columns In v$session_event

SID: The ID of a session (from v$session)

EVENT: The name of a wait event

TOTAL_WAITS: Total number of times this session has waited for this event

TOTAL_TIMEOUTS: Total number of timeouts while this session has waited for this event

TIME_WAITED: Total time waited for this event by this session (in centiseconds)

AVERAGE_WAIT: The average length of a wait for this event in this session (in centiseconds)

MAX_WAIT: The maximum amount of time the session had to wait for this event (in centiseconds)

Page 21: Wait Events2

21

Sample v$session_event Query

SQL> SELECT event, total_waits, time_waited_micro
     FROM   v$session_event
     WHERE  SID =
            (SELECT sid FROM v$session
             WHERE  audsid =
                    USERENV ('sessionid') );

EVENT                       WAITS TIME_WAITED_MICRO
--------------------------- ----- -----------------
db file sequential read       552           2409173
db file scattered read         41            315928
SQL*Net message to client      73               347
SQL*Net message from client    72        3397382712

Page 22: Wait Events2

22

Oracle 9i Bug #2429929

SQL> SELECT event, total_waits, time_waited_micro
     FROM   v$session_event
     WHERE  SID + 1 =
            (SELECT sid FROM v$session
             WHERE  audsid =
                    USERENV ('sessionid') );

EVENT                       WAITS TIME_WAITED_MICRO
--------------------------- ----- -----------------
db file sequential read       552           2409173
db file scattered read         41            315928
SQL*Net message to client      73               347
SQL*Net message from client    72        3397382712

Page 23: Wait Events2

23

The v$event_name View

Shows one row for each wait event name known to the Oracle kernel, along with names of up to three parameters associated with the wait event.

Column Name                Data Type
-------------------------- ------------
EVENT#                     NUMBER
NAME                       VARCHAR2(64)
PARAMETER1                 VARCHAR2(64)
PARAMETER2                 VARCHAR2(64)
PARAMETER3                 VARCHAR2(64)

Page 24: Wait Events2

24

Columns In v$event_name

EVENT#: An internal ID

NAME: The name of a wait event

PARAMETERn: The name of a parameter associated with the wait event

Page 25: Wait Events2

25

Sample v$event_name Query

SQL> SELECT *
     FROM   v$event_name
     WHERE  name = 'db file scattered read';

    EVENT# NAME
---------- ------------------------------
PARAMETER1    PARAMETER2    PARAMETER3
------------- ------------- -------------
        95 db file scattered read
file#         block#        blocks

Page 26: Wait Events2

26

The v$session_wait View

Shows one row for each session, providing detailed information about the current or most recent wait event.

Column Name                Data Type
-------------------------- ------------
SID                        NUMBER
SEQ#                       NUMBER
EVENT                      VARCHAR2(64)
P1TEXT                     VARCHAR2(64)
P1                         NUMBER
P1RAW                      RAW(4)
P2TEXT                     VARCHAR2(64)
P2                         NUMBER
P2RAW                      RAW(4)
P3TEXT                     VARCHAR2(64)
P3                         NUMBER
P3RAW                      RAW(4)
WAIT_TIME                  NUMBER
SECONDS_IN_WAIT            NUMBER
STATE                      VARCHAR2(19)

Page 27: Wait Events2

27

Columns In v$session_wait

SID: The ID of a session

SEQ#: A number that increments by one on each new wait

STATE: An indicator of the session status:

– ‘WAITING’: The session is currently waiting, and details of the wait event are provided.

– ‘WAITED KNOWN TIME’: The session is not waiting, but information about the most recent wait is provided.

– ‘WAITED SHORT TIME’ or ‘WAITED UNKNOWN TIME’: The session is not waiting, but partial information about the most recent wait is provided.

Page 28: Wait Events2

28

Columns In v$session_wait (cont.)

EVENT: The name of a wait event

PnTEXT: The name of a parameter associated with the wait event

Pn: The value of the parameter in decimal form

PnRAW: The value of the parameter in raw form

WAIT_TIME: Length of most recent wait (in centiseconds) if STATE = ‘WAITED KNOWN TIME’

SECONDS_IN_WAIT: How long current wait has been so far if STATE = ‘WAITING’

Page 29: Wait Events2

29

Sample v$session_wait Query

SQL> SELECT * FROM v$session_wait WHERE sid = 16;

 SID  SEQ# EVENT
---- ----- ------------------------------
P1TEXT   P1 P1RAW    P2TEXT   P2 P2RAW
------ ---- -------- ------ ---- --------
P3TEXT   P3 P3RAW    WAIT_TIME SECONDS_IN_WAIT
------ ---- -------- --------- ---------------
STATE
-------------------
  16   303 db file scattered read
file#    17 00000011 block# 2721 00000AA1
blocks    8 00000008        -1               0
WAITED SHORT TIME
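The Pn and PnRAW columns carry the same value in decimal and hexadecimal form. As a small illustration, this Python sketch decodes the three raw values from the sample row; the function name is an assumption made for the example.

```python
def raw_to_int(raw_hex):
    """Interpret a PnRAW value (hexadecimal text) as an unsigned integer."""
    return int(raw_hex, 16)


# PnTEXT -> PnRAW pairs copied from the sample v$session_wait row.
sample = {"file#": "00000011", "block#": "00000AA1", "blocks": "00000008"}

decoded = {name: raw_to_int(raw) for name, raw in sample.items()}
print(decoded)
```

The decoded values agree with the P1, P2 and P3 columns of the same row (file 17, block 2721, 8 blocks).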

Page 30: Wait Events2

30

Tracing Wait Event Activity

Methods for setting debug events:

– ALTER SESSION SET events

– oradebug

– dbms_system.set_ev

Using the dbms_support package or setting debug event 10046 enables SQL trace, and can optionally include wait event information and bind variable data in trace files as well.

Page 31: Wait Events2

31

Activating Wait Event Tracing

dbms_support is missing from many releases of Oracle 8i, but is available as a patch.

dbms_support is not installed by default; run dbmssupp.sql in ?/rdbms/admin to install it.

dbms_system.set_ev is not supported by Oracle Corporation because it lets you set any debug event and some can put your database at risk.

Tracing imposes serious system overhead, so trace only what you need.

Page 32: Wait Events2

32

Debug Event 10046 Settings

ALTER SESSION SET events '10046 trace name context forever, level N';

Value of N Effect

1 Enables ordinary SQL trace

4 Enables SQL trace with bind variable values included in trace file

8 Enables SQL trace with wait event information included in trace file

12 Equivalent of level 4 and level 8 together
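One way to read the table is to treat 4 and 8 as additive flags on top of ordinary SQL trace, which is why level 12 behaves like 4 and 8 together. The Python sketch below encodes only the four levels listed on the slide; the constant and function names are illustrative assumptions.

```python
BINDS = 4   # level 4: bind variable values
WAITS = 8   # level 8: wait event information


def describe_level(level):
    """List what a 10046 trace at the given level writes to the trace file."""
    parts = ["SQL trace"]
    if level & BINDS:
        parts.append("bind variable values")
    if level & WAITS:
        parts.append("wait event information")
    return parts


for level in (1, 4, 8, 12):
    print(level, describe_level(level))
```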

Page 33: Wait Events2

33

Sample Oracle 8i Trace Output

=====================
PARSING IN CURSOR #1 len=80 dep=0 uid=502 oct=3 lid=502 tim=2293771931 hv=2293373707 ad='511dca20'
SELECT /*+ FULL */ SUM (LENGTH(notes))
FROM customer_calls
WHERE status = :x
END OF STMT
PARSE #1:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=0,tim=2293771931
BINDS #1:
 bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 size=24 offset=0
   bfp=09717724 bln=22 avl=02 flg=05
   value=43
EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=2293771931
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=675562835 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=931 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=939 p3=8
WAIT #1: nam='db file sequential read' ela= 0 p1=17 p2=947 p3=1
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1657 p3=8
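WAIT lines like those in the sample can be tallied per event with a short script when TKPROF wait reporting is not available. This is a rough Python sketch, assuming the line layout shown in the sample trace; the regular expression and function name are illustrative, and the elapsed values here are in centiseconds because the trace comes from Oracle 8i.

```python
import re
from collections import defaultdict

# WAIT lines copied from the sample Oracle 8i trace output.
TRACE_LINES = """\
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=675562835 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=931 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=939 p3=8
WAIT #1: nam='db file sequential read' ela= 0 p1=17 p2=947 p3=1
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1657 p3=8
""".splitlines()

# Assumed layout: WAIT #<cursor>: nam='<event>' ela= <n> p1=<n> p2=<n> p3=<n>
WAIT_RE = re.compile(
    r"WAIT #\d+: nam='([^']+)' ela=\s*(\d+) p1=(\d+) p2=(\d+) p3=(\d+)"
)


def summarize(lines):
    """Return {event: [wait_count, total_elapsed]} for every WAIT line."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = WAIT_RE.match(line)
        if match:
            event, elapsed = match.group(1), int(match.group(2))
            totals[event][0] += 1
            totals[event][1] += elapsed
    return dict(totals)


summary = summarize(TRACE_LINES)
for event, (count, elapsed) in summary.items():
    print(f"{event:<28} waits={count} elapsed={elapsed} cs")
```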

Page 34: Wait Events2

34

Wait Event Tracing Enhancements In Oracle 9i

The dbms_support package is provided for easier trace activation.

Elapsed times in the trace file are shown in microseconds instead of centiseconds.

A “waits=yes” option has been added to TKPROF to include wait event statistics in the TKPROF report.

Page 35: Wait Events2

35

Using Wait Event Information

Five examples of how wait event information was used to diagnose production problems

Page 36: Wait Events2

36

Example #1: Buffer Busy Waits

A magazine publisher has a website that displays content stored in a database. At times the website would get bogged down—response time would become poor and the database server would become extremely busy (near-zero idle time).

Page 37: Wait Events2

37

Viewing Wait Events Statistics With Statspack

Collect Statspack snapshots at regular intervals.

Statspack report shows top wait events for entire instance during snapshot interval.

Oracle 9i Statspack also shows CPU time used during the interval.

Page 38: Wait Events2

38

Statspack Report Output

            Snap Id Snap Time          Sessions
            ------- ------------------ --------
Begin Snap:      61 11-Dec-02 13:00:52      145
  End Snap:      71 11-Dec-02 14:00:26      145
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                         Wait % Total
Event                         Waits  Time (cs) Wt Time
------------------------ ---------- ---------- -------
buffer busy waits         1,962,372  1,278,649   50.03
db file sequential read   1,336,870  1,050,878   41.12
db file scattered read       47,717     49,326    1.93
direct path write             8,070     40,574    1.59
latch free                   38,220     31,012    1.21

Page 39: Wait Events2

39

What We See in the Statspack Report

Dominant wait events:

– buffer busy waits

– db file sequential read

Over 23,000 seconds of wait time on these two events in a one hour period (over 6 seconds of waiting per elapsed second)
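The arithmetic behind these figures can be checked directly from the Statspack report: wait times are in centiseconds and the interval runs between the two snapshot times. A minimal Python sketch with the values copied from the report; the variable names are illustrative.

```python
from datetime import datetime

# Snapshot times from the Statspack report header.
begin = datetime(2002, 12, 11, 13, 0, 52)
end = datetime(2002, 12, 11, 14, 0, 26)
interval_s = (end - begin).total_seconds()

# Wait time (centiseconds) for the two dominant events.
wait_cs = {
    "buffer busy waits": 1_278_649,
    "db file sequential read": 1_050_878,
}
wait_s = sum(wait_cs.values()) / 100

# Seconds of waiting accumulated per elapsed second, across all sessions.
ratio = wait_s / interval_s

print(f"interval: {interval_s:.0f} s, waited: {wait_s:.2f} s, ratio: {ratio:.2f}")
```

A ratio above 1 is possible because many sessions wait concurrently.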

Page 40: Wait Events2

40

Understanding the Buffer Busy Waits Event

SQL> SELECT parameter1, parameter2, parameter3
     FROM   v$event_name
     WHERE  name = 'buffer busy waits';

PARAMETER1   PARAMETER2   PARAMETER3
------------ ------------ ------------
file#        block#       id

file#: Data file containing the desired data block

block#: Block within the data file that is desired

id: Reason the buffer in the buffer cache is busy (see Metalink bulletin #34405.1)

Page 41: Wait Events2

41

Finding Which Data Blocks Are Experiencing Buffer Contention

SQL> SELECT sid, event, state, seconds_in_wait,
            wait_time, p1, p2, p3
     FROM   v$session_wait
     WHERE  event = 'buffer busy waits'
     ORDER BY sid;

SID EVENT             STATE SEC TIME    P1    P2    P3
--- ----------------- ----- --- ---- ----- ----- -----
 12 buffer busy waits WAITE   1    0    30 62157   130
 31 buffer busy waits WAITE   1    0    30 23558   130

Page 42: Wait Events2

42

Finding Which Data Blocks Are Experiencing Buffer Contention

SQL> SELECT owner, segment_name, segment_type
     FROM   dba_extents
     WHERE  file_id = &absolute_file_number
     AND    &block_number BETWEEN block_id
                              AND block_id + blocks - 1;

Enter value for absolute_file_number: 30
Enter value for block_number: 62157

OWNER             SEGMENT_NAME        SEGMENT_TYPE
----------------- ------------------- ------------
PRODMGR           SAMPLES             TABLE

Page 43: Wait Events2

43

Reason Codes from Metalink Bulletin #34405.1

P3   Reason Code

110  We want the CURRENT block either shared or exclusive but the Block is being read into cache by another session, so we have to wait until their read() is completed.

130  Block is being read by another session and no other suitable block image was found, so we wait until the read is completed.

220  During buffer lookup for a CURRENT copy of a buffer we have found the buffer but someone holds it in an incompatible mode so we have to wait.

Page 44: Wait Events2

44

What We Have Learned So Far

A buffer containing a data block of the SAMPLES table is experiencing contention.

The buffer in the buffer cache is busy because another session is reading the same data block from disk.

Page 45: Wait Events2

45

Understanding the DB File Sequential Read Event

SQL> SELECT parameter1, parameter2, parameter3
     FROM   v$event_name
     WHERE  name = 'db file sequential read';

PARAMETER1   PARAMETER2   PARAMETER3
------------ ------------ ------------
file#        block#       blocks

file#: Data file containing the desired data block

block#: Block within the data file that is desired

blocks: How many blocks are being read (typically 1 for db file sequential read)

Page 46: Wait Events2

46

Finding Which Data Blocks Are Being Read

SQL> SELECT sid, event, state, seconds_in_wait,
            wait_time, p1, p2, p3
     FROM   v$session_wait
     WHERE  event = 'db file sequential read'
     ORDER BY sid;

SID EVENT             STATE SEC TIME    P1    P2    P3
--- ----------------- ----- --- ---- ----- ----- -----
 17 db file sequentia WAITE   1    0    30 62042     1
 19 db file sequentia WAITE   1    0    30 61731     1
 33 db file sequentia WAITI   0    0    30 57292     1

Page 47: Wait Events2

47

Finding Which Data Blocks Are Being Read

SQL> SELECT owner, segment_name, segment_type
     FROM   dba_extents
     WHERE  file_id = &absolute_file_number
     AND    &block_number BETWEEN block_id
                              AND block_id + blocks - 1;

Enter value for absolute_file_number: 30
Enter value for block_number: 62042

OWNER             SEGMENT_NAME        SEGMENT_TYPE
----------------- ------------------- ------------
PRODMGR           SAMPLES             TABLE

Page 48: Wait Events2

48

The SAMPLES Table

Contained a LONG column with very large values

Excessive row chaining

Most queries did not retrieve the LONG data

Table assigned to KEEP pool, but too large to fit entirely in memory

Page 49: Wait Events2

49

Long-Term Problem Resolution

Convert the LONG column to a CLOB.

Large CLOB data will be stored in a separate LOB segment.

Row chaining will be reduced or eliminated.

The table segment will be much smaller and more likely to fit in memory.

Page 50: Wait Events2

50

Short-Term Problem Resolution

Added index on most columns of SAMPLES table

– Allowed most queries to avoid table segment

Enlarged KEEP pool

– Allowed index segment to fit in memory

Page 51: Wait Events2

51

Statspack Report Output

            Snap Id Snap Time          Sessions
            ------- ------------------ --------
Begin Snap:    1192 20-Dec-02 13:00:49      102
  End Snap:    1202 20-Dec-02 14:00:18      102
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                         Wait % Total
Event                         Waits  Time (cs) Wt Time
------------------------ ---------- ---------- -------
direct path write             6,467     13,545   30.61
log file sync                 4,914      7,493   16.93
library cache pin             1,175      6,090   13.76
direct path read              5,488      3,428    7.75
latch free                   14,528      2,931    6.62

Page 52: Wait Events2

52

What We See in the Statspack Report Now

No db file sequential read or buffer busy waits.

– All data was already in the buffer cache.

Physical reads reduced by over 90%.

Total wait time on all non-idle events reduced by over 98% (top event's wait time in seconds divided by its share of total wait time):

– Before: 12,786.49 / 0.5003 = 25,557.65 seconds

– After: 135.45 / 0.3061 = 442.50 seconds
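The before and after totals are derived by dividing the top event's wait time by its percentage of total wait time in each Statspack report. A short Python sketch of that calculation, using only the figures from the two reports; the function name is an illustrative assumption.

```python
def total_wait_seconds(top_event_cs, top_event_share):
    """Estimate total non-idle wait time from the top event's time and share."""
    return (top_event_cs / 100) / top_event_share


# Before: buffer busy waits, 1,278,649 cs at 50.03% of total wait time.
before = total_wait_seconds(1_278_649, 0.5003)
# After: direct path write, 13,545 cs at 30.61% of total wait time.
after = total_wait_seconds(13_545, 0.3061)

reduction = 1 - after / before
print(f"before={before:.2f} s  after={after:.2f} s  reduction={reduction:.1%}")
```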

Page 53: Wait Events2

53

What We Learned from Wait Event Information 

Large amounts of time were being spent waiting on single block disk reads and buffer contention in the buffer cache.

Random samples showed the disk reads and contention involved the SAMPLES table.

The buffer contention was the result of multiple sessions needing the same block from disk.

Wait events pointed us directly to the problem segment.

Page 54: Wait Events2

54

Example #2: More Buffer Busy Waits, Plus Latch Contention

A genetic research company stored their data in Oracle. Applications running concurrently on many workstations would fetch raw data, process it, and put the data back in the database. But throughput bogged down as they added more workstations.

Page 55: Wait Events2

55

Activating Wait Event Tracing

Added to application code on workstation #30:

ALTER SESSION SET events '10046 trace name context forever, level 8';

Could have used dbms_support if it was installed:

dbms_support.start_trace;

Modified application code to exit after 500 iterations

Page 56: Wait Events2

56

TKPROF Wait Events Reporting in Oracle 9i

tkprof prodgen_ora_16466.trc report_16466.prf waits=yes

Page 57: Wait Events2

57

TKPROF Report Output

UPDATE processing_stations
SET status = 'ACTIVE',
    status_date = SYSDATE,
    data_set_id_being_processed = :b1
WHERE station_id = 30

call     count     cpu   elapsed  disk query current  rows
------- ------ ------- --------- ----- ----- ------- -----
Parse        1    0.01      0.00     0     0       0     0
Execute    500    0.23     10.14     0  3616    1010   500
Fetch        0    0.00      0.00     0     0       0     0
------- ------ ------- --------- ----- ----- ------- -----
total      501    0.24     10.14     0  3616    1010   500

Elapsed times include waiting on following events:
  Event waited on              Times  Max. Wait  Total Waited
  --------------------------- Waited ---------- ------------
  buffer busy waits               26       0.71         7.87
  latch free                      17       0.57         2.08
  log file switch completion       3       0.09         0.20

Page 58: Wait Events2

58

What We See In the TKPROF Report

500 trivial updates took 10.14 seconds

Most of that time was spent waiting

Dominant wait events:

– buffer busy waits

– latch free

CPU time plus wait time does not add up to elapsed time due to round-off errors

Page 59: Wait Events2

59

Waits In the Trace File

WAIT #2: nam='buffer busy waits' ela= 527727 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 498765 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 137611 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 124165 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 5237 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 264050 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 270177 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 330912 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 156317 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 710696 p1=18 p2=10 p3=220

Elapsed times are in microseconds in Oracle 9i
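Because Oracle 9i reports `ela` in microseconds, the values need dividing by 1,000,000 before they can be compared with TKPROF's seconds. The sketch below, in Python, does this for the ten WAIT lines shown on the slide (an excerpt, not all 26 waits) and checks whether they all refer to one block; the names are illustrative.

```python
import re

# Elapsed values (microseconds) from the ten WAIT lines on the slide.
ELAPSED_US = [527727, 498765, 137611, 124165, 5237,
              264050, 270177, 330912, 156317, 710696]
TRACE_LINES = [
    f"WAIT #2: nam='buffer busy waits' ela= {ela} p1=18 p2=10 p3=220"
    for ela in ELAPSED_US
]

FIELDS_RE = re.compile(r"ela=\s*(\d+) p1=(\d+) p2=(\d+) p3=(\d+)")

# Each tuple is (ela, p1, p2, p3) as integers.
waits = [tuple(int(g) for g in FIELDS_RE.search(line).groups())
         for line in TRACE_LINES]

total_us = sum(w[0] for w in waits)
total_seconds = total_us / 1_000_000
longest_seconds = max(w[0] for w in waits) / 1_000_000
blocks = {(w[1], w[2]) for w in waits}   # distinct (file#, block#) pairs

print(f"{len(waits)} waits, {total_seconds:.2f} s total, "
      f"longest {longest_seconds:.2f} s, blocks {sorted(blocks)}")
```

The longest wait in this excerpt, about 0.71 seconds, matches the Max. Wait figure TKPROF reported for buffer busy waits.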

Page 60: Wait Events2

60

Finding Which Data Blocks Are Experiencing Buffer Contention

SQL> SELECT owner, segment_name, segment_type
     FROM   dba_extents
     WHERE  file_id = &absolute_file_number
     AND    &block_number BETWEEN block_id
                              AND block_id + blocks - 1;

Enter value for absolute_file_number: 18
Enter value for block_number: 10

OWNER             SEGMENT_NAME        SEGMENT_TYPE
----------------- ------------------- ------------
GEN               PROCESSING_STATIONS TABLE

Page 61: Wait Events2

61

Reason Codes from Metalink Bulletin #34405.1

P3   Reason Code

110  We want the CURRENT block either shared or exclusive but the Block is being read into cache by another session, so we have to wait until their read() is completed.

130  Block is being read by another session and no other suitable block image was found, so we wait until the read is completed.

220  During buffer lookup for a CURRENT copy of a buffer we have found the buffer but someone holds it in an incompatible mode so we have to wait.

Page 62: Wait Events2

62

What We Have Learned So Far

A buffer containing a data block of the PROCESSING_STATIONS table is experiencing contention.

The buffer in the buffer cache is busy because another session has the buffer in an incompatible mode.

All 26 buffer busy waits totaling 7.87 seconds involved the same data block.

Page 63: Wait Events2

63

The PROCESSING_STATIONS Table

SQL> SELECT SYSDATE - last_analyzed, blocks,
            avg_row_len, avg_space, num_rows
     FROM   user_tables
     WHERE  table_name = 'PROCESSING_STATIONS';

SYSDATE-                AVG_  AVG_
LAST_ANALYZED BLOCKS ROW_LEN SPACE NUM_ROWS
------------- ------ ------- ----- --------
  2.132118056      1      62  1686      100

Page 64: Wait Events2

64

Two Important Observations

There were 100 workstations running the processing application concurrently.

The trace we ran on workstation #30 completed in just under one minute.

Page 65: Wait Events2

65

Lots of Updates!

Workstation #30 updated the PROCESSING_STATIONS table 500 times in less than one minute.

If all 100 workstations do similar things: More than 50,000 updates to one data block every minute by 100 concurrent sessions!

Page 66: Wait Events2

66

Why So Many Updates?

Workstations use the PROCESSING_STATIONS table to track which workstation is processing which data set.

Processing one data set takes between 0.1 second and 20 minutes.

Workstations update the table frequently to keep the timestamp current. This would be helpful in the event of a workstation crash.

Page 67: Wait Events2

67

Long-Term Problem Resolution

Modify the application to update the PROCESSING_STATIONS table less frequently: once per data set, or once per second for larger data sets.

– Will reduce updates by over 80%

– Buffer busy waits will disappear or dramatically decrease

Page 68: Wait Events2

68

Short-Term Problem Resolution

Rebuilt the PROCESSING_STATIONS table with PCTFREE set to 99:

– Oracle reserved 99% of each data block for future row expansion.

– Each row got its own data block.

– Each workstation session now updates a separate data block.

Page 69: Wait Events2

69

The Rebuilt PROCESSING_STATIONS Table

SQL> SELECT SYSDATE - last_analyzed, blocks,
            avg_row_len, avg_space, num_rows
     FROM   user_tables
     WHERE  table_name = 'PROCESSING_STATIONS';

SYSDATE-                AVG_  AVG_
LAST_ANALYZED BLOCKS ROW_LEN SPACE NUM_ROWS
------------- ------ ------- ----- --------
   .130868056    100      62  8014      100

Page 70: Wait Events2

70

TKPROF Report Output

UPDATE processing_stations
SET status = 'ACTIVE',
    status_date = SYSDATE,
    data_set_id_being_processed = :b1
WHERE station_id = 30

call     count     cpu   elapsed  disk query current  rows
------- ------ ------- --------- ----- ----- ------- -----
Parse        1    0.00      0.00     0     0       0     0
Execute    500    0.20      2.22     0   500    1009   500
Fetch        0    0.00      0.00     0     0       0     0
------- ------ ------- --------- ----- ----- ------- -----
total      501    0.20      2.22     0   500    1009   500

Elapsed times include waiting on following events:
  Event waited on              Times  Max. Wait  Total Waited
  --------------------------- Waited ---------- ------------
  latch free                       2       0.35         0.61

Page 71: Wait Events2

71

What We See in the TKPROF Report Now

500 updates took 2.22 seconds, down from 10.14 seconds

No more buffer busy waits

Waited 0.61 seconds on latches, down from 2.08 seconds

CPU time was 0.20 seconds, down from 0.23 seconds

1.41 seconds unaccounted for, likely a mix of waiting for CPU and round-off error
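The unaccounted time is simply elapsed time minus CPU time minus the total of the reported waits. A brief Python sketch applying that subtraction to both TKPROF reports; the function name is an illustrative assumption.

```python
def unaccounted_seconds(elapsed, cpu, waits):
    """Elapsed time not explained by CPU time or by reported wait events."""
    return elapsed - cpu - sum(waits)


# Before the rebuild: buffer busy waits, latch free, log file switch completion.
before = unaccounted_seconds(10.14, 0.23, [7.87, 2.08, 0.20])
# After the rebuild: latch free only.
after = unaccounted_seconds(2.22, 0.20, [0.61])

print(f"before: {before:+.2f} s, after: {after:+.2f} s")
```

The negative figure for the first report shows CPU plus wait time slightly exceeding elapsed time, consistent with the round-off errors noted for that report.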

Page 72: Wait Events2

72

Understanding the Latch Free Event

SQL> SELECT parameter1, parameter2, parameter3
     FROM   v$event_name
     WHERE  name = 'latch free';

PARAMETER1   PARAMETER2   PARAMETER3
------------ ------------ ------------
address      number       tries

address: Join to addr in v$latch

number: Join to latch# in v$latchname

tries: Number of times the session has waited while trying to acquire the latch

Page 73: Wait Events2

73

Waits In the Trace File

WAIT #2: nam='latch free' ela= 47004 p1=15113593728 p2=97 p3=0
WAIT #2: nam='latch free' ela= 14629 p1=15113593728 p2=97 p3=1
WAIT #2: nam='latch free' ela= 20652 p1=15113593728 p2=97 p3=2
WAIT #2: nam='latch free' ela= 37737 p1=15113593728 p2=97 p3=3

Four consecutive waits for one acquisition of the latch

Page 74: Wait Events2

74

Finding Which Latches Are Experiencing Contention

SQL> SELECT latch#, name
     FROM   v$latchname
     WHERE  latch# = &latch_number;

Enter value for latch_number: 97

    LATCH# NAME
---------- --------------------
        97 cache buffers chains

Page 75: Wait Events2

75

What We Learned from Wait Event Information

Much time was spent waiting on latch contention and buffer contention in the buffer cache.

The buffer contention was all for one data block.

The buffer contention was the result of multiple sessions needing to update the same data block.

The latch contention involved the latch that protects the buffer cache chains data structure.

Wait events pointed us directly to the hot buffer in the buffer cache.

Page 76: Wait Events2

76

Example #3: Log File Waits

A data warehouse loader application was tuned in a test environment until it met user acceptance. The production server was larger and more powerful, but the data loads actually took longer in production than in the test environment.

Page 77: Wait Events2

77

Summarizing Wait Events During A Period of Time

v$system_event shows wait event totals since instance startup.

v$session_event shows wait event totals since the beginning of a session.

You can capture view contents at different points in time and compute the delta in order to get wait event information for a specific period of time.

Statspack and many third-party tools can do this, but a simple script has less overhead and can be quicker to deploy.

Page 78: Wait Events2

78

Simple Script to See Wait Events During a 30 Second Period

CREATE TABLE previous_events AS
SELECT SYSDATE timestamp, v$system_event.*
FROM   v$system_event;

EXECUTE dbms_lock.sleep (30);

SELECT   A.event,
         A.total_waits - NVL (B.total_waits, 0) total_waits,
         A.time_waited - NVL (B.time_waited, 0) time_waited
FROM     v$system_event A, previous_events B
WHERE    A.event NOT IN (list of idle events)
AND      B.event (+) = A.event
ORDER BY time_waited;
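The delta logic in the script can be mirrored outside the database. The following Python sketch assumes two snapshots held as dictionaries; the snapshot values are made up for illustration, and the function name is hypothetical. As in the outer join with NVL, an event absent from the first snapshot is counted from zero.

```python
def wait_delta(before, after, idle_events=()):
    """Return (event, waits_delta, time_delta) rows, smallest time delta first.

    before/after map event name -> (total_waits, time_waited).
    """
    rows = []
    for event, (waits, waited) in after.items():
        if event in idle_events:
            continue
        # Same effect as NVL(B.total_waits, 0): missing events start at zero.
        prev_waits, prev_waited = before.get(event, (0, 0))
        rows.append((event, waits - prev_waits, waited - prev_waited))
    return sorted(rows, key=lambda row: row[2])


# Made-up snapshot contents, for illustration only.
before = {"log file sync": (1000, 50000), "smon timer": (10, 300000)}
after = {"log file sync": (1113, 60249), "smon timer": (11, 303000),
         "enqueue": (6, 486)}

delta = wait_delta(before, after, idle_events={"smon timer"})
for event, waits, waited in delta:
    print(f"{event:<20} {waits:>6} {waited:>8}")
```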

Page 79: Wait Events2

79

Wait Events During 30 Seconds of Data Loading

EVENT                        TOTAL_WAITS TIME_WAITED
---------------------------- ----------- -----------
control file sequential read          61           1
latch free                             2           1
db file sequential read                6           7
control file parallel write           41          31
log file single write                  6         164
db file parallel write                13         220
enqueue                                6         486
log buffer space                      24        2007
log file sequential read              30        2655
log file switch completion            33        2883
log file parallel write               19        3561
log file sync                        113       10249

Page 80: Wait Events2

80

What We See in the Script Output

Over 215 seconds spent waiting on log-related events:

– Sessions waited 102 seconds for LGWR to flush the log buffer to disk for a commit

– Sessions waited 48 seconds for LGWR to make space for more redo

– LGWR waited 37 seconds for disk writes

– ARCH waited 26 seconds for disk reads
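These totals can be reproduced from the script output by grouping the log-related events and converting centiseconds to seconds. The grouping in this Python sketch is an inference from matching the numbers, not something the slides state explicitly, and the labels are illustrative.

```python
# TIME_WAITED (centiseconds) for the log-related events in the script output.
observed_cs = {
    "log file single write": 164,
    "log buffer space": 2007,
    "log file sequential read": 2655,
    "log file switch completion": 2883,
    "log file parallel write": 3561,
    "log file sync": 10249,
}

# Assumed mapping of events to the four bullet points.
groups = {
    "sessions waiting on commit flush": ["log file sync"],
    "sessions waiting for redo space": ["log buffer space",
                                        "log file switch completion"],
    "LGWR disk writes": ["log file parallel write", "log file single write"],
    "ARCH disk reads": ["log file sequential read"],
}

group_seconds = {
    label: sum(observed_cs[event] for event in events) / 100
    for label, events in groups.items()
}
total_seconds = sum(observed_cs.values()) / 100

for label, seconds in group_seconds.items():
    print(f"{label:<34} {seconds:>7.2f} s")
print(f"{'total':<34} {total_seconds:>7.2f} s")
```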

Page 81: Wait Events2

81

Investigate the Redo Log

Check production online redo log location for contention and slow hardware:

– All log files located on one disk

– Same disk held hot files for another database

Compare to test environment:

– Log files located on a striped volume

– Minimal other activity on the volume

– Database in NOARCHIVELOG mode

Page 82: Wait Events2

82

Problem Resolution

Sped up online redo log file performance:

– Moved log files to dedicated disks

– Spread log files over multiple disks

Page 83: Wait Events2

83

What We Learned from Wait Event Information

The disk I/O speed writing and reading the online redo log files was the bottleneck slowing down the data warehouse load.

Wait events pointed us directly to the area within Oracle that was holding up the works.

Page 84: Wait Events2

84

Example #4: Direct Path I/O Waits

Analysts in a customer service unit were satisfied with the response time when they queried individual customer orders from their data warehouse. However, queries involving summarizations of multiple orders were unacceptably slow.

Page 85: Wait Events2

85

Database Rx Wait Event Report

Page 86: Wait Events2

86

What We See in the Database Rx Report

Dominant wait events:

– direct path write

– db file scattered read

– direct path read

Above account for 99% of non-idle event wait time

Insignificant db file sequential read waits

Page 87: Wait Events2

87

What We Have Learned So Far

Large amount of direct path I/O activity
– Usually involves temporary segments

Significant multi-block I/O reads
– Full table scans are common in a data warehouse environment

Insignificant single-block I/O reads
– Frequently accessed data blocks probably in buffer cache

Page 88: Wait Events2

88

Understanding the Direct Path I/O Events

SQL> SELECT name, parameter1, parameter2, parameter3
     FROM v$event_name
     WHERE name LIKE 'direct path%';

NAME              PARAMETER1  PARAMETER2 PARAMETER3
----------------- ----------- ---------- ----------
direct path read  file number first dba  block cnt
direct path write file number first dba  block cnt

file number: File containing data block
first dba: First block within the file to be accessed
block cnt: Number of blocks to be accessed

Page 89: Wait Events2

89

Finding Which Files Are Being Accessed

SQL> SELECT sid, event, state, seconds_in_wait,
            wait_time, p1, p2, p3
     FROM v$session_wait
     WHERE event = 'direct path write'
     ORDER BY sid;

SID EVENT             STATE SEC TIME    P1    P2    P3
--- ----------------- ----- --- ---- ----- ----- -----
 39 direct path write WAITI   0    0   201    65     7
 47 direct path write WAITI   0    0   201  2248     7

Page 90: Wait Events2

90

Finding Which Files Are Being Accessed

SQL> SELECT tablespace_name, file_id "AFN"
     FROM dba_data_files
     WHERE file_id = 201;

no rows selected

SQL> SELECT tablespace_name, file_id + value "AFN"
     FROM dba_temp_files, v$parameter
     WHERE name = 'db_files'
     AND file_id + value = 201;

TABLESPACE_NAME                       AFN
------------------------------ ----------
TEMP                                  201
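The two lookups above can be folded into one statement that searches data files and temp files together. This is a sketch under the same assumption the slide makes, namely that a temp file's absolute file number equals its file_id plus the db_files parameter; the FILE_TYPE label is illustrative, and 201 is the P1 value from the v$session_wait sample.

```sql
-- Resolve a P1 file number (201 here) to a tablespace and file name,
-- checking ordinary data files first and temp files second.
SELECT 'DATA' AS file_type, d.tablespace_name, d.file_name,
       d.file_id AS afn
FROM   dba_data_files d
WHERE  d.file_id = 201
UNION ALL
SELECT 'TEMP', t.tablespace_name, t.file_name,
       t.file_id + TO_NUMBER(p.value)
FROM   dba_temp_files t, v$parameter p
WHERE  p.name = 'db_files'
AND    t.file_id + TO_NUMBER(p.value) = 201;
```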

Page 91: Wait Events2

91

Problem Resolution

Increased sort_area_size:
– Was set to default of 65536
– Increased to 10485760 (few concurrent sessions)

If that had not solved the problem:
– Tune application code to reduce sorting
– Check for other active files on disks holding temp files
– Move temp files to a striped volume
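Before and after a change like this, the current setting and the instance's sort activity can be checked with two short queries. This is a minimal sketch; the statistics are cumulative since instance startup, so comparing two readings taken some time apart gives a clearer picture than a single reading.

```sql
-- Current setting in bytes (the slide reports the default of 65536).
SELECT name, value
FROM   v$parameter
WHERE  name = 'sort_area_size';

-- Instance-wide sort counts since startup. A high proportion of
-- 'sorts (disk)' suggests sorts are overflowing the sort area
-- and spilling into temporary segments.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('sorts (memory)', 'sorts (disk)', 'sorts (rows)');
```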

Page 92: Wait Events2

92

What We Learned from Wait Event Information

Direct path I/O accounted for 75% of the non-idle event wait time on the system.

Multi-block reads accounted for 24% of the non-idle event wait time, which is not unusual in a data warehouse environment.

Random samples showed direct path I/O involved the temporary tablespace.

Wait events pointed us directly to the area within Oracle that needed adjustment.

Page 93: Wait Events2

93

Logical vs. Physical Reads

Page 94: Wait Events2

94

Logical vs. Physical Reads

During the Database Rx sample interval there were more physical reads than logical reads.

Direct path reads count as physical reads but not logical reads.

Be careful how you compute your buffer cache hit ratios—in this example you might come up with a negative figure!
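To make the warning concrete, the sketch below computes the classic hit ratio next to a variant that leaves direct path reads out of the physical read count. It assumes the 'physical reads direct' statistic is available in your release, and it uses totals since instance startup rather than a sample interval, so the naive figure will only turn negative when direct reads dominate over that whole period.

```sql
-- naive_hit_ratio:    1 - physical reads / logical reads
-- adjusted_hit_ratio: same, but ignoring reads that bypass the buffer cache
SELECT 1 - (pr.value / (bg.value + cg.value))
         AS naive_hit_ratio,
       1 - ((pr.value - prd.value) / (bg.value + cg.value))
         AS adjusted_hit_ratio
FROM   v$sysstat pr, v$sysstat prd, v$sysstat bg, v$sysstat cg
WHERE  pr.name  = 'physical reads'
AND    prd.name = 'physical reads direct'
AND    bg.name  = 'db block gets'
AND    cg.name  = 'consistent gets';
```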

Page 95: Wait Events2

95

Example #5: Database Link Wait Events

A company had five Oracle databases, one per region. Due to human error, the same customer transactions would sometimes get loaded into multiple databases. A report was built to identify these duplicates, but it took 30 minutes to run.

Page 96: Wait Events2

96

Isolating a Query and Analyzing Its Wait Events

Start a new database session in SQL*Plus or a similar tool.

Run the query.

Monitor the session’s wait events and statistics from another session:
– v$session_event
– v$sesstat

This is a handy technique when you know which statement is the bottleneck.
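Monitoring from another session requires the SID of the session running the query. One simple way to get it, offered here as a suggestion rather than as the method used in the original work, is to ask the new session itself before starting the query:

```sql
-- Run in the session that will execute the query under study.
-- v$mystat holds one row per statistic for the current session only,
-- so any single row reveals this session's SID.
SELECT sid FROM v$mystat WHERE rownum = 1;
```

The value returned is what goes into the `WHERE sid = ...` predicates of the monitoring queries.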

Page 97: Wait Events2

97

Query Output from v$session_event

SQL> SELECT event, total_waits, time_waited, max_wait
     FROM v$session_event
     WHERE sid = 47
     ORDER BY event;

EVENT                       TOTAL_WAITS TIME_WAITED   MAX_WAIT
--------------------------- ----------- ----------- ----------
SQL*Net message from client          32        4435       2432
SQL*Net message from dblink     1525516      104919         31
SQL*Net message to client            33           0          0
SQL*Net message to dblink       1525516         466          9
db file sequential read           27199        8025         28
latch free                           40           5          4
log file sync                         1           2          2

Page 98: Wait Events2

98

Query Output from v$sesstat

SQL> SELECT A.name, B.value
     FROM v$statname A, v$sesstat B
     WHERE A.statistic# = 12
     AND B.statistic# = A.statistic#
     AND B.sid = 47;

NAME                                VALUE
------------------------------ ----------
CPU used by this session            67937
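Statistic numbers are not stable across Oracle releases, so `statistic# = 12` may point at a different statistic on another version. A variant that looks the statistic up by name avoids that dependency; the SID of 47 is the one used in this example.

```sql
-- Same lookup as above, keyed on the statistic name instead of its number.
-- The value is reported in hundredths of a second.
SELECT A.name, B.value
FROM   v$statname A, v$sesstat B
WHERE  A.name = 'CPU used by this session'
AND    B.statistic# = A.statistic#
AND    B.sid = 47;
```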

Page 99: Wait Events2

99

What We See In the v$ Data

1.5 million waits on network roundtrips through a database link: 1053 seconds
– Network latency
– Time for the remote database to respond to each request

27,000 waits for single-block disk reads: 80 seconds

Page 100: Wait Events2

100

The Query We Are Studying

SELECT customer_id, batch_serial_number, batch_date,
       load_date, batch_comment, control_total
FROM customer_xfer_batches A
WHERE exists
      (SELECT 1
       FROM customer_xfer_batches@prdwest B
       WHERE B.customer_id = A.customer_id
       AND B.batch_serial_number = A.batch_serial_number)
ORDER BY customer_id, batch_serial_number;

Page 101: Wait Events2

101

The Query We Are Studying

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT
   1    0   FILTER
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'CUSTOMER_XFER_BATCHES'
   3    2       INDEX (FULL SCAN) OF 'CUST_XFER_BAT_PK' (UNIQUE)
   4    1     REMOTE* PRDWEST

   4 SERIAL_FROM_REMOTE   SELECT "CUSTOMER_ID","BATCH_SERIAL_NUMBER"
                          FROM "CUSTOMER_XFER_BATCHES" "B"
                          WHERE "BATCH_SERIAL_NUMBER"=:1
                          AND "CUSTOMER_ID"=:2

Page 102: Wait Events2

102

CUSTOMER_XFER_BATCHES

SQL> SELECT blocks, num_rows
     FROM user_tables
     WHERE table_name = 'CUSTOMER_XFER_BATCHES';

BLOCKS NUM_ROWS
------ --------
 21825  1526003

Page 103: Wait Events2

103

What We Have Learned So Far

Oracle is doing a full scan of the index on the local table and fetching each row one at a time
– This does avoid a sort
– Very high price to pay to skip sorting a few rows

Oracle is doing one remote query for each row fetched from the local table

Page 104: Wait Events2

104

Problem Resolution - Part 1

SELECT customer_id, batch_serial_number, batch_date,
       load_date, batch_comment, control_total
FROM customer_xfer_batches
WHERE (customer_id, batch_serial_number) IN
      (SELECT customer_id, batch_serial_number
       FROM customer_xfer_batches
       INTERSECT
       SELECT customer_id, batch_serial_number
       FROM customer_xfer_batches@prdwest)
ORDER BY customer_id, batch_serial_number;

Page 105: Wait Events2

105

Query Output from v$session_event

SQL> SELECT event, total_waits, time_waited, max_wait
     FROM v$session_event
     WHERE sid = 49
     ORDER BY event;

EVENT                       TOTAL_WAITS TIME_WAITED   MAX_WAIT
--------------------------- ----------- ----------- ----------
SQL*Net message from client          46        3680       2481
SQL*Net message from dblink          24          31         18
SQL*Net message to client            47           0          0
SQL*Net message to dblink            24           0          0
SQL*Net more data from dbli        5978        1337         13
db file scattered read             3430         675          8
db file sequential read             182          60          2
direct path read                    148         233         11
direct path write                   920        3572         33

Page 106: Wait Events2

106

Query Output from v$sesstat

SQL> SELECT A.name, B.value
     FROM v$statname A, v$sesstat B
     WHERE A.statistic# = 12
     AND B.statistic# = A.statistic#
     AND B.sid = 49;

NAME                                VALUE
------------------------------ ----------
CPU used by this session             3227

Page 107: Wait Events2

107

What We See in the v$ Data Now

24 network roundtrips through a database link instead of 1.5 million: 14 seconds (down from 1053)
– Fewer, larger network packets
– Fewer requests to remote database

3,600 waits on mostly multi-block disk reads instead of 27,000 waits on single-block disk reads: 7 seconds (down from 80)
– Fewer multi-block reads instead of many single-block reads

Page 108: Wait Events2

108

What We See in the v$ Data Now

1100 waits on direct path I/O: 38 seconds (new)
– Sorting to implement the INTERSECT operation

32 seconds of CPU time (down from 679)
– Fewer logical reads and network roundtrips

Elapsed time: 92 seconds (down from 31 minutes)

Page 109: Wait Events2

109

Iterative Tuning

Curing one bottleneck often reveals or creates another, smaller bottleneck.

Repeat the wait event evaluation process after each change until performance goals are met.

In this situation, a 95% reduction in runtime from 31 minutes to 92 seconds still did not meet the performance goal.

Page 110: Wait Events2

110

What We Have So Far

Rewritten query completes in 92 seconds:
– 32 CPU seconds
– 38 seconds of wait on direct path I/O
– 14 seconds of wait on network roundtrips
– 7 seconds of wait on multi-block and single-block reads
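A breakdown like the one above can be produced in a single listing by combining the session's wait events with its CPU statistic. This is a rough sketch: the SID of 49 comes from the example, the COMPONENT and SECONDS aliases are illustrative, and the two client-side SQL*Net events are left out on the assumption that they are idle time between calls.

```sql
-- Time profile for one session: wait events plus CPU, largest first.
-- Both TIME_WAITED and the CPU statistic are in hundredths of a second.
SELECT e.event AS component, e.time_waited / 100 AS seconds
FROM   v$session_event e
WHERE  e.sid = 49
AND    e.event NOT IN ('SQL*Net message from client',
                       'SQL*Net message to client')
UNION ALL
SELECT n.name, s.value / 100
FROM   v$statname n, v$sesstat s
WHERE  n.name = 'CPU used by this session'
AND    s.statistic# = n.statistic#
AND    s.sid = 49
ORDER  BY 2 DESC;
```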

Page 111: Wait Events2

111

Problem Resolution - Part 2

Eliminating or speeding up direct path I/O seems like the logical next step:
– sort_area_size set to 1 MB
– Try dynamically changing it to 100 MB?
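One way to change the setting dynamically is at the session level, which leaves every other session on the instance untouched. The slide does not say which mechanism was used, so treat this as an assumption; note also that a sort area this large is only reasonable when few sessions will use it at once.

```sql
-- 100 MB expressed in bytes (100 * 1024 * 1024).
-- Affects only the current session, and only until it disconnects.
ALTER SESSION SET sort_area_size = 104857600;
```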

Page 112: Wait Events2

112

Query Output from v$session_event

SQL> SELECT event, total_waits, time_waited, max_wait
     FROM v$session_event
     WHERE sid = 46
     ORDER BY event;

EVENT                       TOTAL_WAITS TIME_WAITED   MAX_WAIT
--------------------------- ----------- ----------- ----------
SQL*Net message from client          47         442        287
SQL*Net message from dblink          25          25         14
SQL*Net message to client            48           0          0
SQL*Net message to dblink            25           0          0
SQL*Net more data from dbli        6050        1378         26
db file scattered read             3430         945          8
db file sequential read             191          59          1
log file sync                         1           3          3

Page 113: Wait Events2

113

Query Output from v$sesstat

SQL> SELECT A.name, B.value
     FROM v$statname A, v$sesstat B
     WHERE A.statistic# = 12
     AND B.statistic# = A.statistic#
     AND B.sid = 46;

NAME                                VALUE
------------------------------ ----------
CPU used by this session             3296

Page 114: Wait Events2

114

What We See in the v$ Data Now

Waits on network roundtrips through a database link, multi-block reads, and single-block reads unchanged

CPU time used unchanged

Direct path I/O waits eliminated completely
– Entire sort now performed in memory

Elapsed time: 55 seconds (down from 92)

Page 115: Wait Events2

115

What We Learned from Wait Event Information

A query ran slowly due to excessive network roundtrips and single-block reads.

After these problems were corrected, 40% of the query execution time was devoted to sorting to disk.

Wait events showed us how Oracle was spending its time while executing the query, helping us improve the query’s performance in an iterative fashion.

Page 116: Wait Events2

116

A Summary Of Wait Event Techniques

Using Statspack snapshots and reports to analyze wait events at the instance level

Polling v$session_wait to determine which buffers or latches have contention

Enabling wait event tracing in a session

Using Oracle 9i TKPROF to tabulate waits at the statement level within one session
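This part of the presentation does not show how tracing is switched on. A commonly used mechanism is event 10046 at level 8, sketched below for the current session; the resulting trace file is written to the directory named by the user_dump_dest parameter and can then be formatted with TKPROF.

```sql
-- Make sure wait times are actually recorded.
ALTER SESSION SET timed_statistics = TRUE;

-- Level 8 adds wait event lines to the SQL trace file.
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';

-- ... run the statements to be traced ...

-- Switch tracing off again.
ALTER SESSION SET EVENTS '10046 trace name context off';
```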

Page 117: Wait Events2

117

A Summary Of Wait Event Techniques (continued)

Collecting wait event data for a session or the entire instance at two different times and computing the difference to find the wait events during a specific period of time

Ranking cumulative wait event data in order to see which wait events account for the most wait time

Isolating a statement and analyzing its wait events
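The snapshot-and-difference technique listed above can be sketched with a scratch table. The table name wait_snap_before is hypothetical, and the sketch assumes the instance is not restarted between the two readings, since that resets the cumulative counters.

```sql
-- Step 1: capture the cumulative instance-level figures.
CREATE TABLE wait_snap_before AS
SELECT event, total_waits, time_waited
FROM   v$system_event;

-- Step 2: let the workload of interest run, then report the difference,
-- ranked by wait time. The outer join keeps events that first appeared
-- after the snapshot was taken.
SELECT a.event,
       a.total_waits - NVL(b.total_waits, 0)         AS waits,
       (a.time_waited - NVL(b.time_waited, 0)) / 100 AS seconds
FROM   v$system_event a, wait_snap_before b
WHERE  b.event (+) = a.event
AND    a.time_waited - NVL(b.time_waited, 0) > 0
ORDER  BY seconds DESC;
```

The same pattern works per session by reading v$session_event instead and keeping the sid in the snapshot.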

Page 118: Wait Events2

118

Send Us Your Wait Event Puzzles

We are always looking for interesting wait event situations to learn from!

If you are trying to diagnose a problem using the wait event interface, feel free to email us wait events data and a problem description.

We’ll do our best to look over what you send us and share our thoughts with you.

Page 119: Wait Events2

119

The White Paper

A companion white paper to this presentation is available for free download from our company’s website at:

www.dbspecialists.com/presentations.html

Page 120: Wait Events2

120

Resources from Database Specialists

The Specialist newsletter
– www.dbspecialists.com/specialist.html

Database Rx®
– dbrx.dbspecialists.com/guest
• Provides secure, automated monitoring, alert notification, and analysis of your Oracle databases

Page 121: Wait Events2

121

In Conclusion

The wait event interface gives you access to a detailed accounting of how Oracle processes spend their time.

Wait events touch all aspects of the Oracle database server.

The wait event interface will not always give you the answer to every performance problem, but it will just about always give you insights that guide you down the proper path to problem resolution.

Page 122: Wait Events2

122

Contact Information

Roger Schrag
[email protected]

Terry Sutton
[email protected]

Database Specialists, Inc.
388 Market Street, Suite 400
San Francisco, CA 94111
Tel: 415/344-0500
Web: www.dbspecialists.com