©2013 Enkitec
Profiling the logwriter and database writer
Frits Hoogland UKOUG Tech 2014
This is the font size used for showing screen output. Be sure this is readable for you.
This is the font used to accentuate text/console output. Make sure this is readable for you too!
`whoami`
• Frits Hoogland
• Working with Oracle products since 1996
• Blog: http://fritshoogland.wordpress.com
• Twitter: @fritshoogland
• Email: [email protected]
• Oracle ACE Director
• OakTable Member
Goals & prerequisites
• Goal: learn about the typical behaviour of both lgwr and dbwr, both the visible part (wait events) and the inner workings.
• Prerequisites:
  – Understanding of (internal) execution of C programs.
  – Understanding of Oracle tracing mechanisms.
  – Understanding of interaction between processes inside the Oracle database.
Test system
• The tests and investigation are done in a VM:
  – Host: Mac OSX 10.9 / VMWare Fusion 6.0.2.
  – VM: Oracle Linux x86_64 6u5 (UEK3 3.8.13).
  – Oracle Grid 11.2.0.4 with ASM/External redundancy.
  – Oracle database 11.2.0.4.
  – Unless specified otherwise.
Logwriter, concepts guide
• From the concepts guide:
  – The lgwr manages the redo log buffer.
  – The lgwr writes all redo entries that have been copied into the buffer since the last time it wrote, when:
    • A user commits.
    • A log switch occurs.
    • Three seconds have passed since the last write*.
    • The buffer is 1/3 full or 1MB is filled.
    • dbwr must write modified (‘dirty’) buffers*.
Logwriter, idle
• The general behaviour of the log writer can easily be shown by putting a 10046/8 trace on lgwr:
SYS@v11204 AS SYSDBA> @who
…
130,1,@1 2243 [email protected] (LGWR)
…
SYS@v11204 AS SYSDBA> oradebug setospid 2243
Oracle pid: 11, Unix process pid: 2243, image: [email protected] (LGWR)
SYS@v11204 AS SYSDBA> oradebug unlimit
Statement processed.
SYS@v11204 AS SYSDBA> oradebug event 10046 trace name context forever, level 8;
Statement processed.
Logwriter, idle
• The 10046/8 trace shows:
*** 2013-12-18 14:12:32.479
WAIT #0: nam='rdbms ipc message' ela= 2999925 timeout=300 p2=0 p3=0 obj#=-1 tim=1387372352479352
*** 2013-12-18 14:12:35.479
WAIT #0: nam='rdbms ipc message' ela= 3000075 timeout=300 p2=0 p3=0 obj#=-1 tim=1387372355479531
*** 2013-12-18 14:12:38.479
WAIT #0: nam='rdbms ipc message' ela= 2999755 timeout=300 p2=0 p3=0 obj#=-1 tim=1387372358479381
*** 2013-12-18 14:12:41.479
WAIT #0: nam='rdbms ipc message' ela= 3000021 timeout=300 p2=0 p3=0 obj#=-1 tim=1387372361479499
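The WAIT lines above can be picked apart with a few lines of Python. This is a minimal sketch, not an Oracle tool: the function name `parse_wait` and the regular expressions are my own, and it assumes `ela` is in microseconds and `timeout` in centiseconds, which matches the 3-second sleeps shown.

```python
import re

# Two WAIT lines copied from the trace above.
TRACE = """\
WAIT #0: nam='rdbms ipc message' ela= 2999925 timeout=300 p2=0 p3=0 obj#=-1 tim=1387372352479352
WAIT #0: nam='rdbms ipc message' ela= 3000075 timeout=300 p2=0 p3=0 obj#=-1 tim=1387372355479531
"""

WAIT_RE = re.compile(r"^WAIT #(\d+): nam='([^']+)' ela= *(\d+)(.*)$")
PAIR_RE = re.compile(r"([^\s=]+)=(-?\d+)")


def parse_wait(line):
    """Return a dict for one WAIT line, or None if the line is not a WAIT line."""
    m = WAIT_RE.match(line.strip())
    if m is None:
        return None
    wait = {"cursor": int(m.group(1)), "nam": m.group(2), "ela_us": int(m.group(3))}
    # The rest of the line is a series of key=value pairs (timeout, p2, p3, obj#, tim).
    for key, value in PAIR_RE.findall(m.group(4)):
        wait[key] = int(value)
    return wait


waits = [w for w in map(parse_wait, TRACE.splitlines()) if w is not None]
for w in waits:
    # Assumption: ela is in microseconds, timeout in centiseconds.
    print(w["nam"], w["ela_us"] / 1e6, "s slept, timeout", w["timeout"] / 100, "s")
```

Comparing `ela_us` with the timeout is a quick way to see whether a sleep ran its full 3 seconds or was cut short.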
Logwriter, idle
• “rdbms ipc message” indicates a sleep/idle event.
  – There is no indication that lgwr writes anything:
semtimedop(327683, {{15, -1, 0}}, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
getrusage(RUSAGE_SELF, {ru_utime={0, 84000}, ru_stime={0, 700000}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 84000}, ru_stime={0, 700000}, ...}) = 0
times({tms_utime=8, tms_stime=70, tms_cutime=0, tms_cstime=0}) = 431286151
times({tms_utime=8, tms_stime=70, tms_cutime=0, tms_cstime=0}) = 431286151
times({tms_utime=8, tms_stime=70, tms_cutime=0, tms_cstime=0}) = 431286151
times({tms_utime=8, tms_stime=70, tms_cutime=0, tms_cstime=0}) = 431286151
semtimedop(327683, {{15, -1, 0}}, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
…etc…
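To read the semtimedop() lines: the arguments are the semaphore set id, a list of operations (semaphore number, operation, flags), the number of operations, and a timeout as {seconds, nanoseconds}. With flags 0, an EAGAIN return means the timeout expired without a post. The sketch below decodes one strace line; `decode_semtimedop` and its regular expression are illustrative helpers of mine and only handle the single-operation form shown in this deck.

```python
import re

# One line of strace output from the idle logwriter shown above.
LINE = ("semtimedop(327683, {{15, -1, 0}}, 1, {3, 0}) "
        "= -1 EAGAIN (Resource temporarily unavailable)")

SEMTIMEDOP_RE = re.compile(
    r"semtimedop\((?P<semid>\d+), "
    r"\{\{(?P<num>\d+), (?P<op>-?\d+), (?P<flg>\d+)\}\}, "
    r"(?P<nsops>\d+), \{(?P<sec>\d+), (?P<nsec>\d+)\}\)"
    r"(?: = (?P<ret>-?\d+)(?: (?P<err>[A-Z]+))?)?"
)


def decode_semtimedop(line):
    """Decode a single-operation semtimedop() call as printed by strace."""
    m = SEMTIMEDOP_RE.search(line)
    if m is None:
        return None
    return {
        "semid": int(m.group("semid")),
        "sem_num": int(m.group("num")),   # which semaphore in the set
        "sem_op": int(m.group("op")),     # -1: wait until the value can be decremented
        "sem_flg": int(m.group("flg")),
        "nsops": int(m.group("nsops")),
        "timeout_s": int(m.group("sec")) + int(m.group("nsec")) / 1e9,
        # With sem_flg 0, EAGAIN means the time limit expired without a post.
        "timed_out": m.group("err") == "EAGAIN",
    }


call = decode_semtimedop(LINE)
print(call)
```

Here the logwriter waits on semaphore 15 of set 327683 for 3 seconds and nobody posts it.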
Logwriter, idle
• It does look in the /proc filesystem at the ‘stat’ file of a certain process:
open("/proc/2218/stat", O_RDONLY) = 21
read(21, "2218 (oracle) S 1 2218 2218 0 -1"..., 999) = 209
close(21)
• It does so every 20th time (20 * 3 = 60 seconds).
• The PID is PMON.
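The first fields of a /proc/<pid>/stat line are pid, command name in parentheses, process state, parent pid, process group and session. A small sketch of parsing them follows; it only uses the visible start of the read() buffer above (the rest is truncated in the strace output), and `parse_proc_stat` is a name I made up for illustration.

```python
def parse_proc_stat(text):
    """Parse the leading fields of a /proc/<pid>/stat line."""
    pid_part, _, rest = text.partition(" (")
    # The command name may itself contain spaces or parentheses,
    # so split on the last ") " rather than the first.
    comm, _, tail = rest.rpartition(") ")
    fields = tail.split()
    return {
        "pid": int(pid_part),
        "comm": comm,
        "state": fields[0],   # e.g. S = sleeping, R = running
        "ppid": int(fields[1]),
        "pgrp": int(fields[2]),
        "session": int(fields[3]),
    }


# Visible prefix of the buffer lgwr read from /proc/2218/stat.
SAMPLE = "2218 (oracle) S 1 2218 2218 0 -1"
stat = parse_proc_stat(SAMPLE)
print(stat)
```

On a live Linux system the same function can be fed the contents of any /proc/<pid>/stat file.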
Logwriter, idle
• Recap:
  – In an idle database.
  – The lgwr sleeps on a semaphore for 3 seconds.
    • Then wakes up, and sets up the semaphore/sleep again.
    • Processes sleeping on a semaphore do not spend CPU.
  – Every minute, lgwr reads pmon's process stats.
  – lgwr doesn’t write if there’s no need.
• But what happens when we insert a row of data, and commit that?
Logwriter, commit
TS@//localhost/v11204 > insert into t values ( 1, 'aaaa', 'bbbb' );
1 row created.
TS@//localhost/v11204 > commit;
Commit complete.
Logwriter, commit - expected
(Timeline diagram with a foreground lane and a logwriter lane.)
foreground:
  commit;
  semctl(458755, 15, SETVAL, 0x7fff00000001)
  semtimedop(458755, {{33, -1, 0}}, 1, {0, 100000000})
logwriter:
  semtimedop(458755, {{15, -1, 0}}, 1, {3, 0})
  io_submit(139981752844288, 1, {{0x7f5008e23480, 0, 1, 0, 256}})
  io_getevents(139981752844288, 1, 128, {{0x7f5008e23480, 0x7f5008e23480, 3584, 0}}, {600, 0})
  semctl(458755, 33, SETVAL, 0x1)
Logwriter, commit - actual
(Timeline diagram with a foreground lane and a logwriter lane.)
foreground:
  commit;
  semctl(458755, 15, SETVAL, 0x7fff00000001)
  No semtimedop(): no ‘log file sync’ wait!
logwriter:
  semtimedop(458755, {{15, -1, 0}}, 1, {3, 0})
  io_submit(139981752844288, 1, {{0x7f5008e23480, 0, 1, 0, 256}})
  io_getevents(139981752844288, 1, 128, {{0x7f5008e23480, 0x7f5008e23480, 3584, 0}}, {600, 0})
  No semctl()
Logwriter, commit
• Investigation shows:
  – Foreground scans log writer progress up to 3 times.
    • kcrf_commit_force() > kcscur3()
  – If its data* in the redo log buffer is not written:
    • It notifies the lgwr that it is going to sleep on a semaphore.
    • semtimedop() for 100ms, until posted by lgwr.
  – If its data* has been written:
    • No need to wait on it.
    • No ‘log file sync’ wait.
Logwriter, commit
• Wait!!!
  – This (no log file sync) turned out to be an edge case.
    • I traced the kcrf_commit_force() and kcscur3() calls using breakpoints in gdb.
  – In normal situations, the wait will appear.
    • Depending on log writer and FG progress.
    • The semtimedop() call in the FG can be absent.
      – As a result, lgwr will not semctl().
Logwriter, commit - post-wait
(Timeline diagram with a foreground lane and a logwriter lane. Grayed means: optional.)
foreground:
  commit;
  semctl(458755, 15, SETVAL, 0x7fff00000001)
  kcrf_commit_force()
  kcscur3()
  semtimedop(458755, {{33, -1, 0}}, 1, {0, 100000000})   wait event: log file sync
logwriter:
  semtimedop(458755, {{15, -1, 0}}, 1, {3, 0})   wait event: rdbms ipc message
  io_submit(139981752844288, 1, {{0x7f5008e23480, 0, 1, 0, 256}})
  io_getevents(139981752844288, 1, 128, {{0x7f5008e23480, 0x7f5008e23480, 3584, 0}}, {600, 0})   wait event: log file parallel write
Adaptive log file sync
• Feature of Oracle 11.2
  – Parameter '_use_adaptive_log_file_sync'
    • Set to FALSE up to 11.2.0.2
    • Set to TRUE starting from 11.2.0.3
    • Third value ‘POLLING_ONLY’
  – Makes Oracle adaptively switch between ‘post-wait’ and polling.
  – The log writer writes a notification in its logfile if it switches between modes (if param = ‘TRUE’)
Logwriter, commit - polling
(Timeline diagram with a foreground lane and a logwriter lane.)
foreground:
  commit;
  semctl(458755, 15, SETVAL, 0x7fff00000001)
  kcrf_commit_force
  kcscur3
  nanosleep({0, 9409000}, 0x7fff64725480)   wait event: log file sync (nanosecs; varies)
logwriter:
  semtimedop(458755, {{15, -1, 0}}, 1, {3, 0})   wait event: rdbms ipc message
  io_submit(139981752844288, 1, {{0x7f5008e23480, 0, 1, 0, 256}})
  io_getevents(139981752844288, 1, 128, {{0x7f5008e23480, 0x7f5008e23480, 3584, 0}}, {600, 0})   wait event: log file parallel write
Logwriter, post-wait vs. polling
– No wait event ‘log file sync’ if:
  • Lgwr was able to flush the committed data before the foreground has issued kcscur3() 2/3 times in kcrf_commit_force().
– If not, the foreground starts a ‘log file sync’ wait.
  • If in “post-wait” mode (default), it will record its waiting state in the post-wait queue and sleep in semtimedop() for 100ms at a time, waiting to be posted by lgwr.
  • If in “polling” mode, it will sleep in nanosleep() for a computed time*, then check lgwr progress: if the lgwr write has progressed beyond its committed data SCN, the wait ends; otherwise it starts sleeping in nanosleep() again.
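The difference between the two modes can be illustrated with a toy model. This is only a sketch under assumptions of mine: `commit_wait`, `lgwr_done_at_ms` (time after the foreground's last progress check at which lgwr finishes the write) and the fixed 10ms poll interval are all hypothetical, and scheduling latency is ignored. Oracle computes the real polling sleep itself, and it varies.

```python
import math


def commit_wait(mode, lgwr_done_at_ms, poll_sleep_ms=10.0, post_timeout_ms=100.0):
    """Toy model of the foreground side of a commit (not Oracle code)."""
    if lgwr_done_at_ms <= 0:
        # lgwr had already flushed the data: no 'log file sync' wait at all.
        return {"log_file_sync": False, "sleeps": 0, "waited_ms": 0.0}
    if mode == "post-wait":
        # semtimedop() sleeps at most post_timeout_ms per call,
        # but a post from lgwr ends the sleep immediately.
        sleeps = math.ceil(lgwr_done_at_ms / post_timeout_ms)
        waited_ms = float(lgwr_done_at_ms)
    elif mode == "polling":
        # nanosleep() always runs its full interval before progress is checked.
        sleeps = math.ceil(lgwr_done_at_ms / poll_sleep_ms)
        waited_ms = sleeps * poll_sleep_ms
    else:
        raise ValueError("mode must be 'post-wait' or 'polling'")
    return {"log_file_sync": True, "sleeps": sleeps, "waited_ms": waited_ms}


for mode in ("post-wait", "polling"):
    print(mode, commit_wait(mode, lgwr_done_at_ms=3.0))
```

In this model polling overshoots the moment the write completes by up to one sleep interval, while post-wait depends on lgwr getting around to posting every waiter.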
Logwriter
• The main task of lgwr is to flush data in the log buffer to disk.
  – The lgwr is idle when waiting on ‘rdbms ipc message’.
  – There are two main* indicators of lgwr busyness:
    • CPU time.
    • Wait event ‘log file parallel write’.
• The lgwr needs to be able to get onto the CPU in order to do its processing!
Logwriter - idle
(Two diagram slides: the logwriter lane shows repeated semtimedop(458755, {{15, -1, 0}}, 1, {3, 0}) sleeps, each recorded as an ‘rdbms ipc message’ wait.)
Logwriter - idle
Idle mode latch gets (logwriter):
‘messages’, ‘mostly latch-free SCN’, ‘lgwr LWN SCN’, ‘KTF sga latch’, ‘redo allocation’, ‘messages’
Logwriter - writing
Write mode latch gets and frees (logwriter):
‘messages’, ‘mostly latch-free SCN’, ‘lgwr LWN SCN’, ‘KTF sga latch’, ‘redo allocation’, ‘messages’, ‘redo writing’*
Logwriter - writing
(Diagram of the logwriter lane.)
logwriter:
  io_submit(139981752844288, n, {{0x7f5008e23480, 0, 1, 0, 256}})
  wait event: log file parallel write
    io_getevents(139981752844288, n, 128, {{0x7f5008e23480, 0x7f5008e23480, 3584, 0}}, {0, 0})   (non-blocking)
    io_getevents(139981752844288, n, 128, {{0x7f5008e23480, 0x7f5008e23480, 3584, 0}}, {600, 0})   (blocking)
With the Linux ‘strace’ utility, either the non-blocking syscall or the blocking syscall is visible.
Logwriter - writing
• The event ‘log file parallel write’ is an indicator of IO wait time for the lgwr.
  – NOT IO latency time!!
(Diagram: io_submit() followed by two io_getevents() calls. Two IOs with real IO times of 10ms and 7ms; the ‘log file parallel write’ wait measures 9.85ms.)
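The numbers on this slide can be reproduced with simple arithmetic. The sketch below assumes, as my reading of the diagram, that both IOs are submitted together at t=0, that the wait starts shortly after io_submit() returns, and that it ends when the slowest IO has been reaped. The 0.15ms offset is a made-up value chosen to match the 9.85ms on the slide, and `parallel_write_wait_ms` is a hypothetical helper.

```python
def parallel_write_wait_ms(io_times_ms, wait_start_ms):
    """Wait time seen by lgwr if the wait starts at wait_start_ms after the
    submit and ends when the slowest IO has completed."""
    return max(io_times_ms) - wait_start_ms


io_times = [10.0, 7.0]            # real IO times from the slide
wait = parallel_write_wait_ms(io_times, wait_start_ms=0.15)
average_latency = sum(io_times) / len(io_times)

print("log file parallel write: %.2f ms" % wait)
print("average IO latency:      %.2f ms" % average_latency)
```

The wait is a single number per batch of writes, so it says how long lgwr waited, not how long each IO took.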
Logwriter - writing - ASM
(Diagram of io_submit(), io_getevents() with timeout 0s and io_getevents() with timeout 600s relative to the ‘log file parallel write’ wait, for versions 11.2.0.1, 11.2.0.2, 11.2.0.3, 11.2.0.4, 12.1.0.1 and 12.1.0.2.)
Logwriter - writing - filesystem
(Diagram of io_submit() and io_getevents() with timeout 600s relative to the ‘log file parallel write’ wait, for versions 11.2.0.1, 11.2.0.2, 11.2.0.3, 11.2.0.4, 12.1.0.1 and 12.1.0.2.)
Logwriter - writing - 12c
• Actually, the version 12 diagram is a lie.
• A default Oracle 12 database uses a new feature:
  – Adaptive scalable lgwr workers
• Which means you get a master lgwr process,
  – and log writer slaves (lgnn processes)
Logwriter - writing - 12c
• The adaptive scalable lgwr workers feature is controlled by the parameter:
  • _use_single_log_writer
  • ‘ADAPTIVE’ (default), ‘TRUE’, ‘FALSE’.
• I did set it to ‘TRUE’ to revert to 11g behaviour.
Logwriter - writing - 12c
• When adaptive, on my system*:
  • lgwr process.
  • 2 lgnn processes.
• ‘adaptive’ means either the lgwr process or the slave(s) write.
Logwriter - writing - 12c
• The log writer trace file tells what is happening:
  • Writing is moved from LGWR to slaves:
*** 2014-11-15 13:16:04.003
kcrfw_slave_adaptive_updatemode: single->scalable redorate=182371 switch=37881
*** 2014-11-15 13:16:04.004
Adaptive scalable LGWR enabling workers
  • Writing is moved from slaves to LGWR:
*** 2014-11-15 13:16:25.065
kcrfw_slave_adaptive_updatemode: scalable->single group0=2527 all=2552 rw=3957 single=3100 scalable_nopipe=7914 scalable_pipe=4352 scalable=7878
*** 2014-11-15 13:16:25.065
Adaptive scalable LGWR disabling workers
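To follow these mode switches over time, the kcrfw_slave_adaptive_updatemode lines can be pulled out of the trace file. A minimal sketch: `parse_mode_switch` and its regular expressions are my own, the meaning of the individual counters (redorate, switch, group0, ...) is not documented here, so they are only collected, not interpreted.

```python
import re

# The two mode-switch lines from the log writer trace file above.
LINES = [
    "kcrfw_slave_adaptive_updatemode: single->scalable redorate=182371 switch=37881",
    "kcrfw_slave_adaptive_updatemode: scalable->single group0=2527 all=2552 rw=3957 "
    "single=3100 scalable_nopipe=7914 scalable_pipe=4352 scalable=7878",
]

MODE_RE = re.compile(r"kcrfw_slave_adaptive_updatemode: (\w+)->(\w+)(.*)")
PAIR_RE = re.compile(r"(\w+)=(\d+)")


def parse_mode_switch(line):
    """Return (old_mode, new_mode, counters) or None for other trace lines."""
    m = MODE_RE.search(line)
    if m is None:
        return None
    counters = {key: int(value) for key, value in PAIR_RE.findall(m.group(3))}
    return m.group(1), m.group(2), counters


switches = [parse_mode_switch(line) for line in LINES]
for old, new, counters in switches:
    print(old, "->", new, counters)
```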
Logwriter - writing - 12c
• The waits are a bit different between single and scalable mode:
  • Single (LGWR) writes are discussed in this presentation.
  • The lgnn processes wait for
    – ‘LGWR worker group idle’ forever.
  • This means the wait time is either the time since startup or the time since they last wrote.
Logwriter - writing - 12c
• The waits are a bit different between single and scalable mode:
  • In scalable mode, LGWR receives the write request.
  • LGWR semctl’s one or more slaves to write.
  • Then sleeps in ‘rdbms ipc message’.
  • The lgnn processes wake up, and write.
    – io_submit & io_getevents in wait ‘log file parallel write’.
    – semctl’s FG once ready.
Logwriter - writing - 12c
• In scalable mode:
  • I suspended execution of the slaves.
  • After some time, this is noticed by LGWR
    • Wait ‘target log write size’.
    • Wait ‘LGWR all worker groups’.
Logwriter wait events
• rdbms ipc message
  – timeout: 300 (centiseconds; 3 seconds).
  – process sleeping ~ 3 seconds on semaphore.
• log file parallel write
  – files: number of log file members.
  – blocks: total number of log blocks written.
  – requests: ?
    • I’ve seen this differ from the actual number of IO requests.
Logwriter wait events
• Let’s switch the database to synchronous IO.
  – Some platforms have difficulty with AIO (HPUX!)
  – Got to check if your config does use AIO.
    • Found out by accident that ASM+NFS has no AIO by default.
  – Good to understand what the absence of AIO means.
• If you can’t use AIO today, you are doing it WRONG!
log file parallel write
disk_asynch_io=FALSE (no AIO)
kslwtbctx
semtimedop - 458755 semid, timeout: $62 = {tv_sec = 3, tv_nsec = 0}
kslwtectx -- Previous wait time: 584208: rdbms ipc message
pwrite64 - fd, size - 256,512
pwrite64 - fd, size - 256,512
kslwtbctx
kslwtectx -- Previous wait time: 782: log file parallel write
kslwtbctx
semtimedop - 458755 semid, timeout: $63 = {tv_sec = 2, tv_nsec = 310000000}
kslwtectx -- Previous wait time: 2315982: rdbms ipc message
Notes:
• timeout = 3s, wait time = 0.584208s => semaphore is posted!
• Two sequential writes (two logfiles).
• The wait begins AFTER the write (!); it’s also suspiciously fast (0.8ms).
log file parallel write
disk_asynch_io=FALSE (no AIO)
Let’s add 100ms to the IOs (shell sleep 0.1)
kslwtbctx
semtimedop - 458755 semid, timeout: $3 = {tv_sec = 2, tv_nsec = 900000000}
kslwtectx -- Previous wait time: 2905568: rdbms ipc message
pwrite64 - fd, size - 256,512
>sleep 0.1
pwrite64 - fd, size - 256,512
>sleep 0.1
kslwtbctx
kslwtectx -- Previous wait time: 545: log file parallel write
Notes:
• Two writes again. At each breakpoint a sleep of 100ms is added. This should make the timing at least 200,000 microseconds.
• The timing is 545 (0.5ms): timing is off.
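Whether a wait event actually covers the IO can be checked mechanically from such a listing: an IO call is only timed if it sits between a wait begin (kslwtbctx) and a wait end (kslwtectx). The sketch below applies that rule to the listing above and, for contrast, to the ‘control file sequential read’ listing from this deck; `io_calls_inside_wait` is an illustrative helper of mine and only knows the pread64/pwrite64 calls.

```python
def io_calls_inside_wait(listing):
    """Return (syscall, inside_wait) for every pread64/pwrite64 in a listing."""
    in_wait = False
    result = []
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith("kslwtbctx"):        # wait begins
            in_wait = True
        elif line.startswith("kslwtectx"):      # wait ends
            in_wait = False
        elif line.startswith(("pwrite64", "pread64")):
            result.append((line.split()[0], in_wait))
    return result


LOG_FILE_PARALLEL_WRITE = """\
kslwtbctx
semtimedop - 458755 semid, timeout: $3 = {tv_sec = 2, tv_nsec = 900000000}
kslwtectx -- Previous wait time: 2905568: rdbms ipc message
pwrite64 - fd, size - 256,512
pwrite64 - fd, size - 256,512
kslwtbctx
kslwtectx -- Previous wait time: 545: log file parallel write
"""

CONTROL_FILE_SEQUENTIAL_READ = """\
kslwtbctx
pread64 - fd, size - 256,16384
kslwtectx -- Previous wait time: 100323: control file sequential read
"""

print(io_calls_inside_wait(LOG_FILE_PARALLEL_WRITE))
print(io_calls_inside_wait(CONTROL_FILE_SEQUENTIAL_READ))
```

Both pwrite64 calls fall outside any wait, which is why 200ms of added sleep does not show up in the 545 microsecond wait time.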
log file parallel write - 12c
disk_asynch_io=FALSE: 12.1.0.2
semtimedop semid:557061, timeout:3,0
kskthewt (151)
kskthbwt (137)
pwrite64 fd:256, buf:96499c00, size:1536, offset:207200256
pwrite64 fd:256, buf:96499c00, size:1536, offset:100245504
kskthewt (137)
semctl semid:557061, semnum:16, cmd:16
kskthbwt (151)
semtimedop semid:557061, timeout:3,0
Semaphore sleep
IO calls are now INSIDE the wait!
log file parallel write
• Conclusion:
  – For at least Oracle version 11.2.
  – When synchronous IO (pwrite64()) is issued.
    • disk_asynch_io = FALSE (ASM)
    • filesystemio_options != “setall” or “asynch”
  – The wait event does not time the IO requests.
• How about the other log writer wait events?
control file sequential read
disk_asynch_io=FALSE (no AIO)
kslwtbctx
pread64 - fd, size - 256,16384
>sleep 0.1
kslwtectx -- Previous wait time: 100323: control file sequential read
This event is correctly timed.
control file parallel write
disk_asynch_io=FALSE (no AIO)
pwrite64 - fd, size - 256,16384
>sleep 0.1
kslwtbctx
kslwtectx -- Previous wait time: 705: control file parallel write
This event is incorrectly timed!
log file single write
disk_asynch_io=FALSE (no AIO)
kslwtbctx
pwrite64 - fd, size - 256,512
>sleep 0.1
kslwtectx -- Previous wait time: 104594: log file single write
This event is correctly timed.
Logwriter wait events logswitch
• Some of these waits typically show up during a logswitch.
  – These are all the waits which are normally seen:
    • os thread startup (semctl()-semtimedop())
    • control file sequential read (pread64())
    • control file parallel write (io_submit()-io_getevents())
    • log file sequential read (pread64())
    • log file single write (pwrite64())
    • KSV master wait (semctl() post to dbwr)
• This is with AIO enabled!
Logwriter, timeout message
• Warning:
Warning: log write elapsed time 523ms, size 2760KB
• Printed in logwriter tracefile (NOT alert.log)
• This is instrumented with the ‘log file parallel write’ event.
• Threshold set with parameter:
  – _side_channel_batch_timeout_ms (500ms)
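These warnings are easy to collect from the logwriter trace file. A minimal sketch, assuming the message format is exactly as shown above; `parse_log_write_warning` and the derived KB/s figure are my own additions, not something Oracle prints.

```python
import re

WARN_RE = re.compile(r"Warning: log write elapsed time (\d+)ms, size (\d+)KB")


def parse_log_write_warning(line):
    """Return elapsed time, size and derived throughput, or None for other lines."""
    m = WARN_RE.search(line)
    if m is None:
        return None
    elapsed_ms = int(m.group(1))
    size_kb = int(m.group(2))
    return {
        "elapsed_ms": elapsed_ms,
        "size_kb": size_kb,
        # Derived value: how fast this particular write went.
        "kb_per_s": size_kb * 1000 / elapsed_ms,
    }


warning = parse_log_write_warning("Warning: log write elapsed time 523ms, size 2760KB")
print(warning)
```

Looking at size next to elapsed time helps to tell a slow small write from a large write that simply took long.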
Logwriter, timeout message
• Warning (RAC!):
Warning: log write broadcast wait time 2913ms (SCN 0xb86.cd638134)
• Printed in logwriter tracefile (NOT alert.log)
• This is instrumented with the ‘wait for scn ack’ event.
Logwriter: disable logging
• The “forbidden switch”: _disable_logging
  – Do not use this for anything other than tests!
• Everything is done the same, no magic
  – Except the write by the lgwr to the logfiles
  – No ‘log file parallel write’
  – Redo/control/data files are synced with shutdown normal
• A way to test if lgwr IO influences db processing
Logwriter: exadata
• What does this look like on Exadata?
Logwriter: exadata
kslwtbctx
semtimedop - 3309577 semid, timeout: $24 = {tv_sec = 2, tv_nsec = 970000000}
kslwtectx -- Previous wait time: 2973630: rdbms ipc message
$25 = "oss_write"
$26 = "oss_write"
$27 = "oss_write"
$28 = "oss_write"
kslwtbctx
$29 = "oss_wait"
$30 = "oss_wait"
$31 = "oss_wait"
$32 = "oss_wait"
kslwtectx -- Previous wait time: 2956: log file parallel write
kslwtbctx
semtimedop - 3309577 semid, timeout: $33 = {tv_sec = 3, tv_nsec = 0}
kslwtectx -- Previous wait time: 3004075: rdbms ipc message
Notes:
• The lgwr semaphore sleep.
• The writes (oss_write) are issued here. There is no io_submit-like wait. This is not timed.
• The wait is log file parallel write, identical to non-Exadata. It seems to wait for all issued IOs.
Logwriter: exadata - 12c
semtimedop
kskthewt (151)
kskthbwt (137)
oss_write
oss_write
oss_write
oss_write
oss_wait
oss_wait
oss_wait
oss_wait
kskthewt (137)
semctl
kskthbwt (151)
semtimedop
The semaphore sleep.
With 12c, both the submit of the IO and the waiting for it to finish are timed.
The wait is log file parallel write, identical to non-exadata. It seems to wait for all issued IOs
Database writer
• From the Oracle 11.2 concepts guide:
  – The DBWn process writes dirty buffers to disk under the following conditions:
    • When a server process cannot find a clean reusable buffer after scanning a threshold of buffers, it signals DBWn to write. DBWn writes dirty buffers to disk asynchronously if possible while performing other processing.
    • DBWn periodically writes buffers to advance the checkpoint, which is the position in the redo thread from which instance recovery begins. The log position of the checkpoint is determined by the oldest dirty buffer in the buffer cache.
Database writer, idle
• The 10046/8 trace shows:
*** 2013-12-31 00:45:51.088
WAIT #0: nam='rdbms ipc message' ela= 3006219 timeout=300 p2=0 p3=0 obj#=-1 tim=1388447151086891
*** 2013-12-31 00:45:54.142
WAIT #0: nam='rdbms ipc message' ela= 3005237 timeout=300 p2=0 p3=0 obj#=-1 tim=1388447154140873
*** 2013-12-31 00:45:57.197
WAIT #0: nam='rdbms ipc message' ela= 3005258 timeout=300 p2=0 p3=0 obj#=-1 tim=1388447157195828
*** 2013-12-31 00:46:00.255
WAIT #0: nam='rdbms ipc message' ela= 3005716 timeout=300 p2=0 p3=0 obj#=-1 tim=1388447160253960
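A minimal sketch of how these WAIT lines can be picked apart, to make the idle pattern explicit. The helper names (parse_wait, slept_full_timeout) are made up for this illustration. It assumes that ela is in microseconds and that timeout=300 is in centiseconds, which matches the 3 second semaphore sleep.

```python
import re

# Prefix of a 10046 WAIT line; everything after 'ela= <n> ' is key=value pairs.
WAIT_RE = re.compile(r"WAIT #(\d+): nam='([^']+)' ela= (\d+) (.*)")


def parse_wait(line):
    """Return the fields of one WAIT line as a dict, or None for other lines."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    cursor, name, ela, rest = m.groups()
    wait = {"cursor": int(cursor), "name": name, "ela_us": int(ela)}
    for pair in rest.split():
        key, value = pair.split("=", 1)
        wait[key] = int(value)
    return wait


def slept_full_timeout(wait):
    # Assumption: 'timeout' is in centiseconds, 'ela' in microseconds.
    return wait["ela_us"] >= wait["timeout"] * 10_000


line = ("WAIT #0: nam='rdbms ipc message' ela= 3006219 "
        "timeout=300 p2=0 p3=0 obj#=-1 tim=1388447151086891")
wait = parse_wait(line)
print(wait["name"], wait["ela_us"] / 1e6, slept_full_timeout(wait))
```

A wait that ran for the full timeout means nobody posted the process: it simply slept its 3 seconds.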
Database writer, idle
• “rdbms ipc message” indicates a sleep/idle event
  – There is no indication that dbw0 writes anything:
semtimedop(983043, {{14, -1, 0}}, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
getrusage(RUSAGE_SELF, {ru_utime={0, 31000}, ru_stime={0, 89000}, ...}) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 31000}, ru_stime={0, 89000}, ...}) = 0
times({tms_utime=3, tms_stime=8, tms_cutime=0, tms_cstime=0}) = 431915044
times({tms_utime=3, tms_stime=8, tms_cutime=0, tms_cstime=0}) = 431915044
times({tms_utime=3, tms_stime=8, tms_cutime=0, tms_cstime=0}) = 431915044
semtimedop(983043, {{14, -1, 0}}, 1, {3, 0}) = -1 EAGAIN (Resource temporarily unavailable)
…etc…
Database writer, idle
• It does read the ‘stat’ file of a certain process in the /proc filesystem:
open("/proc/2218/stat", O_RDONLY) = 21
read(21, "2218 (oracle) S 1 2218 2218 0 -1"..., 999) = 209
close(21)
• It does so every 20th time: 20 * 3 = 60 sec.
• The PID is that of PMON.
Database writer, idle
• Recap:
  – In an idle database.
  – The dbwr sleeps on a semaphore for 3 seconds.
    • Then wakes up, and sets up the semaphore/sleep again.
    • Processes sleeping on a semaphore do not spend CPU
  – Every minute, dbwr reads pmon's process stats.
  – dbwr doesn’t write if there’s no need.
Database writer, force write
• We can force the dbwr to write:
  – Dirty some blocks (insert a row into a table).
  – Force a thread checkpoint (alter system checkpoint).
* There are multiple ways, this is one of them.
Database writer, force write
10046/8 trace:
*** 2014-01-03 03:37:47.473
WAIT #0: nam='rdbms ipc message' ela= 3000957 timeout=300 p2=0 p3=0 obj#=-1 tim=1388716667473024
*** 2014-01-03 03:37:49.735
WAIT #0: nam='rdbms ipc message' ela= 2261867 timeout=300 p2=0 p3=0 obj#=-1 tim=1388716669735046
WAIT #0: nam='db file async I/O submit' ela= 0 requests=3 interrupt=0 timeout=0 obj#=-1 tim=1388716669735493
WAIT #0: nam='db file parallel write' ela= 21 requests=1 interrupt=0 timeout=2147483647 obj#=-1 tim=1388716669735566
*** 2014-01-03 03:37:50.465
WAIT #0: nam='rdbms ipc message' ela= 729110 timeout=73 p2=0 p3=0 obj#=-1 tim=1388716670464967
elapsed time = 2.26 sec. So the dbwr is posted!
db file async I/O submit?! It looks like the io_submit() call is instrumented for the dbwr!
But what does ‘requests=3’ mean for a single row update checkpoint?
And the write, via the event ‘db file parallel write’.
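The tim values in this trace can be used to see how much time falls outside any wait event. A rough sketch, assuming tim is a microsecond timestamp taken when the wait ends (so tim minus ela is when it began). The function name untimed_gaps is made up for this illustration.

```python
import re

TRACE = """\
WAIT #0: nam='rdbms ipc message' ela= 2261867 timeout=300 p2=0 p3=0 obj#=-1 tim=1388716669735046
WAIT #0: nam='db file async I/O submit' ela= 0 requests=3 interrupt=0 timeout=0 obj#=-1 tim=1388716669735493
WAIT #0: nam='db file parallel write' ela= 21 requests=1 interrupt=0 timeout=2147483647 obj#=-1 tim=1388716669735566
WAIT #0: nam='rdbms ipc message' ela= 729110 timeout=73 p2=0 p3=0 obj#=-1 tim=1388716670464967
"""

# Event name, elapsed time and end timestamp of every WAIT line.
PAT = re.compile(r"nam='([^']+)' ela= (\d+) .* tim=(\d+)")


def untimed_gaps(trace):
    """Microseconds between the end of one wait and the begin of the next."""
    rows = [(name, int(ela), int(tim)) for name, ela, tim in PAT.findall(trace)]
    gaps = []
    for (_, _, prev_tim), (name, ela, tim) in zip(rows, rows[1:]):
        gaps.append((name, tim - prev_tim - ela))
    return gaps


for name, gap_us in untimed_gaps(TRACE):
    print(f"{gap_us:>6} us not covered by a wait before '{name}'")
```

Under that assumption, the largest untimed gap sits between the semaphore wakeup and the 'db file async I/O submit' wait, which is where the submit work would have to happen.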
dbwr, sql_trace + strace
• Let’s take a look at the Oracle wait events, together with the actual system calls.
• That is:
  – Setting a 10046/8 event for trace and waits.
  – Executing strace with ‘-e write=all -e all’
dbwr, sql_trace + strace
io_submit(140195085938688, 3, {{0x7f81b622ab10, 0, 1, 0, 256}, {0x7f81b622a8a0, 0, 1, 0, 256}, {0x7f81b622a630, 0, 1, 0, 256}}) = 3
write(13, "WAIT #0: nam='db file async I/O "..., 108) = 108
| 00000 57 41 49 54 20 23 30 3a 20 6e 61 6d 3d 27 64 62 WAIT #0: nam='db |
| 00010 20 66 69 6c 65 20 61 73 79 6e 63 20 49 2f 4f 20  file async I/O  |
| 00020 73 75 62 6d 69 74 27 20 65 6c 61 3d 20 31 20 72 submit' ela= 1 r |
| 00030 65 71 75 65 73 74 73 3d 33 20 69 6e 74 65 72 72 equests=3 interr |
| 00040 75 70 74 3d 30 20 74 69 6d 65 6f 75 74 3d 30 20 upt=0 timeout=0  |
| 00050 6f 62 6a 23 3d 2d 31 20 74 69 6d 3d 31 33 38 38 obj#=-1 tim=1388 |
| 00060 39 37 37 36 35 31 38 30 34 32 36 31 977651804261 |
io_getevents(140195085938688, 1, 128, {{0x7f81b622ab10, 0x7f81b622ab10, 8192, 0}, {0x7f81b622a8a0, 0x7f81b622a8a0, 8192, 0}, {0x7f81b622a630, 0x7f81b622a630, 8192, 0}}, {600, 0}) = 3
write(13, "WAIT #0: nam='db file parallel w"..., 116) = 116
| 00000 57 41 49 54 20 23 30 3a 20 6e 61 6d 3d 27 64 62 WAIT #0: nam='db |
| 00010 20 66 69 6c 65 20 70 61 72 61 6c 6c 65 6c 20 77  file parallel w |
| 00020 72 69 74 65 27 20 65 6c 61 3d 20 35 38 20 72 65 rite' ela= 58 re |
| 00030 71 75 65 73 74 73 3d 31 20 69 6e 74 65 72 72 75 quests=1 interru |
| 00040 70 74 3d 30 20 74 69 6d 65 6f 75 74 3d 32 31 34 pt=0 timeout=214 |
| 00050 37 34 38 33 36 34 37 20 6f 62 6a 23 3d 2d 31 20 7483647 obj#=-1  |
| 00060 74 69 6d 3d 31 33 38 38 39 37 37 36 35 31 38 30 tim=138897765180 |
| 00070 34 35 37 39 4579 |
This is the MINIMUM number of requests to reap before the call returns successfully (min_nr, see man io_getevents).
?
?
?
The timeout for io_getevents() is set to 600 seconds. struct timespec { sec, nsec }
Despite only needing 1 request, this call returned all 3. This information is NOT EXTERNALISED (!!)
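The hex dump that strace prints for the write() to the trace file can be turned back into the WAIT line. A small sketch, assuming the dump layout shown on the slide (offset, up to 16 hex bytes, then the ASCII column). The function name decode_strace_hex is made up for this illustration.

```python
import re

DUMP = [
    "| 00000 57 41 49 54 20 23 30 3a 20 6e 61 6d 3d 27 64 62 WAIT #0: nam='db |",
    "| 00010 20 66 69 6c 65 20 70 61 72 61 6c 6c 65 6c 20 77  file parallel w |",
    "| 00020 72 69 74 65 27 20 65 6c 61 3d 20 35 38 20 72 65 rite' ela= 58 re |",
    "| 00030 71 75 65 73 74 73 3d 31 20 69 6e 74 65 72 72 75 quests=1 interru |",
    "| 00040 70 74 3d 30 20 74 69 6d 65 6f 75 74 3d 32 31 34 pt=0 timeout=214 |",
    "| 00050 37 34 38 33 36 34 37 20 6f 62 6a 23 3d 2d 31 20 7483647 obj#=-1  |",
    "| 00060 74 69 6d 3d 31 33 38 38 39 37 37 36 35 31 38 30 tim=138897765180 |",
    "| 00070 34 35 37 39 4579 |",
]

HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def decode_strace_hex(lines):
    """Rebuild the written text from the hex column of a strace dump."""
    data = bytearray()
    for line in lines:
        tokens = line.strip(" |").split()
        # tokens[0] is the offset; at most 16 hex bytes follow.
        # Caveat: an ASCII column that itself starts with a token that looks
        # like a hex byte would be misread on a short last line.
        for tok in tokens[1:17]:
            if not HEX_BYTE.fullmatch(tok):
                break
            data.append(int(tok, 16))
    return data.decode("ascii")


print(decode_strace_hex(DUMP))
```

The decoded line shows requests=1, which lines up with min_nr=1 in the io_getevents() call rather than with the 3 IOs that were actually returned.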
dbwr, db file async I/O submit
• Let’s take a look at what the documentation says about “db file async I/O submit”:
(That’s right…nothing)
dbwr, db file async I/O submit
• My Oracle Support on “db file async I/O submit”:
'db file async I/O submit' when FILESYSTEMIO_OPTIONS=NONE
[Article ID 1274737.1]
How To Address High Wait Times for the 'Direct Path Write Temp ' Wait Event
[Article ID 1576956.1]
• Neither note describes what this event is.
• The first note applies only to filesystemio_options=NONE and describes the event not being tracked prior to version 11.2.0.2.
dbwr, db file async I/O submit
• So the question is:
  – What DOES the event “db file async I/O submit” mean?
• The obvious answer is:
  – Instrumentation of the io_submit() call.
• My answer is:
  – Don’t know.
  – But NOT the instrumentation of io_submit().
dbwr, db file async I/O submit
• This is a trace of the relevant C functions:
kslwtbctx
kslwtectx -- Previous wait time: 236317: rdbms ipc message
io_submit - 3,45e5a000 - nr,ctx
kslwtbctx
kslwtectx -- Previous wait time: 688: db file async I/O submit
kslwtbctx
io_getevents - 1,45e5a000 - minnr,ctx,timeout: $3 = {tv_sec = 600, tv_nsec = 0}
skgfr_return64 - 3 IOs returned
kslwtectx -- Previous wait time: 9604: db file parallel write
Waiting on a semaphore to be posted.
io_submit() for 3 IOs
The wait begins AFTER the io_submit()?
io_getevents() is properly timed. min_nr=1, got 3 IOs
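To read such a trace mechanically: every call between kslwtbctx (begin wait) and kslwtectx (end wait) is inside the wait, anything else is not timed. A sketch under that assumption. The function name calls_by_wait is made up for this illustration.

```python
TRACE = """\
kslwtbctx
kslwtectx -- Previous wait time: 236317: rdbms ipc message
io_submit - 3,45e5a000 - nr,ctx
kslwtbctx
kslwtectx -- Previous wait time: 688: db file async I/O submit
kslwtbctx
io_getevents - 1,45e5a000 - minnr,ctx,timeout: $3 = {tv_sec = 600, tv_nsec = 0}
skgfr_return64 - 3 IOs returned
kslwtectx -- Previous wait time: 9604: db file parallel write
"""


def calls_by_wait(trace):
    """Pair every traced call with the wait event that brackets it, or None."""
    in_wait = False
    pending = []   # calls seen since the last kslwtbctx
    result = []
    for line in trace.splitlines():
        if not line.strip():
            continue
        func = line.split()[0]
        if func == "kslwtbctx":
            in_wait, pending = True, []
        elif func == "kslwtectx":
            # The event name is only known when the wait ends.
            event = line.rsplit(": ", 1)[1]
            result.extend((call, event) for call in pending)
            in_wait = False
        elif in_wait:
            pending.append(func)
        else:
            result.append((func, None))
    return result


for call, event in calls_by_wait(TRACE):
    print(f"{call:<16} {event or 'NOT inside a wait'}")
```

For this trace, io_submit ends up outside any wait, while io_getevents sits inside 'db file parallel write'.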
dbwr, db file async I/O submit
• Trace with sleep 0.1 in the break on io_submit()
kslwtbctx
kslwtectx -- Previous wait time: 385794: rdbms ipc message
io_submit - 3,45e5a000 - nr,ctx
> sleep 0.1
kslwtbctx
kslwtectx -- Previous wait time: 428: db file async I/O submit
kslwtbctx
io_getevents - 1,45e5a000 - minnr,ctx,timeout: $37 = {tv_sec = 600, tv_nsec = 0}
skgfr_return64 - 3 IOs returned
kslwtectx -- Previous wait time: 8053: db file parallel write
Waiting on a semaphore to be posted.
io_submit() for 3 IOs + sleep of 100’000 µs
Wait time too low. io_submit() is not timed.
dbwr, db file async I/O submit
• Trace, version 12.1.0.2
semtimedop semid:688133, timeout:3,0
kskthewt (8)
io_submit ctx:8ce04000, nr:61
kskthbwt (157)
kskthewt (157)
kskthbwt (156)
io_getevents ctx:8ce04000, min_nr:4, nr:128, timeout 600,0
kskthewt (156)
semctl semid:688133, semnum:17, cmd:16
kskthbwt (8)
semtimedop semid:688133, timeout:3,0
io_submit still is outside of the wait.
dbwr, db file parallel write
• Let’s look at the “db file parallel write” event.
dbwr, db file parallel write
• Description from the Reference Guide:
db file parallel write
This event occurs in the DBWR. It indicates that the DBWR is performing a parallel write to files and blocks. When the last I/O has gone to disk, the wait ends.
Wait Time: Wait until all of the I/Os are completed
Parameter Description
requests: This indicates the total number of I/O requests, which will be the same as blocks
interrupt:
timeout: This indicates the timeout value in hundredths of a second to wait for the I/O completion.
Correct Correct…but only if AIO is enabled.
Incorrect
Incorrect
Incorrect
Empty?
Probably incorrect. Or does a timeout of 2’147’483’647 / 100 / 60 / 60 / 24 = 248.55 days make sense to *anybody*?
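The arithmetic behind the 248.55 days, spelled out. The only assumption is the unit from the Reference Guide (hundredths of a second). Note that the value equals 2^31 - 1, the largest signed 32-bit integer, so it may simply be a "no timeout" placeholder rather than a real duration.

```python
TIMEOUT = 2_147_483_647          # 'timeout' value from the WAIT line

# The value is the largest signed 32-bit integer.
assert TIMEOUT == 2**31 - 1

seconds = TIMEOUT / 100          # hundredths of a second -> seconds
days = seconds / 60 / 60 / 24    # seconds -> days
print(round(days, 2))
```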
dbwr, db file parallel write
• Recap of previous traced calls:
kslwtbctx
kslwtectx -- Previous wait time: 236317: rdbms ipc message
io_submit - 3,45e5a000 - nr,ctx
kslwtbctx
kslwtectx -- Previous wait time: 688: db file async I/O submit
kslwtbctx
io_getevents - 1,45e5a000 - minnr,ctx,timeout: $3 = {tv_sec = 600, tv_nsec = 0}
skgfr_return64 - 3 IOs returned
kslwtectx -- Previous wait time: 9604: db file parallel write
So… how about severely limiting OS IO capacity to see what happens?
dbwr, db file parallel write
• Database writer, severely limited IO (1 IOPS)
io_submit - 366,45e5a000 - nr,ctx
kslwtbctx
kslwtectx -- Previous wait time: 1070: db file async I/O submit
kslwtbctx
io_getevents - 100,45e5a000 - minnr,ctx,timeout: $7 = {tv_sec = 600, tv_nsec = 0}
skgfr_return64 - 100 IOs returned
kslwtectx -- Previous wait time: 109334845: db file parallel write
io_getevents - 128,45e5a000 - minnr,ctx,timeout: $8 = {tv_sec = 0, tv_nsec = 0}
io_getevents - 128,45e5a000 - minnr,ctx,timeout: $9 = {tv_sec = 0, tv_nsec = 0}
io_submit - 73,45e5a000 - nr,ctx
kslwtbctx
kslwtectx -- Previous wait time: 486: db file async I/O submit
366 IO requests are submitted onto the OS.
But only 100 IOs are needed to satisfy io_getevents(), which it does in this case… leaving outstanding IOs.
The dbwr starts issuing non-blocking calls to reap IOs! It seems to be always 2 if outstanding IOs remain.
Minnr = # outstanding IOs, max 128.
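The observed rule for the non-blocking calls can be written down as a one-liner. This is a model of what the trace shows, not Oracle's code. The name nonblocking_min_nr and the constant MAX_EVENTS are made up for this illustration.

```python
MAX_EVENTS = 128  # assumption: the cap seen in the traces


def nonblocking_min_nr(submitted, reaped):
    """min_nr the dbwr seems to pass to the non-blocking io_getevents()."""
    outstanding = submitted - reaped
    return min(outstanding, MAX_EVENTS)


# 366 IOs submitted, 100 returned by the blocking call: 266 outstanding.
print(nonblocking_min_nr(366, 100))
```

With 266 IOs still outstanding the result is capped at 128, matching the two io_getevents calls with timeout {0, 0} in the trace.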
dbwr, db file parallel write
• This got me thinking…
• The dbwr submits the IOs it needs to write.
• But it waits for a variable number of IOs to finish.
  – Wait event ‘db file parallel write’.
  – The number seems to be 25-33% of the submitted IOs*
  – After that, 2 tries to reap the remaining IOs*
  – Then either submit again, DFPW until the IOs are reaped, or back to sleeping on the semaphore.
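A quick check of the ratio between min_nr and the number of submitted IOs, using only the numbers from the traces shown so far. The labels are mine. Treat the outcome as observations from a handful of traces, not as a rule.

```python
# (label, IOs submitted with io_submit, min_nr of the blocking io_getevents)
observations = [
    ("11.2, single row + checkpoint", 3, 1),
    ("11.2, limited to 1 IOPS", 366, 100),
    ("12.1.0.2", 61, 4),
]

ratios = {label: min_nr / submitted for label, submitted, min_nr in observations}

for label, ratio in ratios.items():
    print(f"{label:<32} {ratio:6.1%}")
```

The two 11.2 traces land inside the 25-33% range. The 12.1.0.2 trace shows a much lower ratio, which is one more reason to read the range as approximate.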
dbwr, db file parallel write
• This means ‘db file parallel write’ is not:
  – A physical IO indicator.
  – IO latency timing.
• I’ve come to the conclusion that the dbwr’s blocking io_getevents call for a number of IOs is an IO limiter/throttle.
• …and ‘db file parallel write’ is the timing of it.
dbwr, db file parallel write
• This implementation did not change with version 12!
• However, I only investigated ASM.
  – …but peeked at the filesystem implementation.
  – The next slide shows what that looks like:
dbwr, db file parallel write
• Database writer - filesystem
semtimedop semid:557061, timeout:3,0
kskthewt (8)
kskthbwt (157)
io_submit ctx:ebd26000, nr:96
kskthewt (157)
kskthbwt (156)
io_getevents ctx:ebd26000, min_nr:6, nr:128, timeout 600,0
kskthewt (156)
io_getevents ctx:ebd26000, min_nr:90, nr:128, timeout 0,0
kskthbwt (156)
io_getevents ctx:ebd26000, min_nr:6, nr:128, timeout 600,0
kskthewt (156)
io_getevents ctx:ebd26000, min_nr:84, nr:128, timeout 0,0
Sleeping on a semaphore
96 IOs are submitted
io_getevents: inside wait: wait for 6 / 96
io_getevents: non-blocking, get the remainder
io_getevents: inside wait: wait for 6 / 96
…etc…
do you spot the oddity here??
The io_submit() system call is properly instrumented!
dbwr, synchronous IO
• Let’s turn AIO off again.
  – To simulate this, I’ve set disk_asynch_io to FALSE.
• And set a 10046/8 trace and strace on the dbwr.
• And issue the SQLs as before:
  – insert into followed by commit
  – alter system checkpoint
dbwr, synchronous IO
pwrite(256, "\6\242\0\0\207\261\0\0hG\17\0\0\0\2\0063R\0\0\1\0\0\0009U\1\0[G\17\0"..., 8192, 8053121024) = 8192
pwrite(256, "\2\242\0\0\246\0\300\0hG\17\0\0\0\1\4\220T\0\0\3\0\r\0j\3\0\0\203\0\17\17"..., 8192, 7443103744) = 8192
pwrite(256, "&\242\0\0\240\0\300\0hG\17\0\0\0\2\4\372\216\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 7443054592) = 8192
write(11, "WAIT #0: nam='db file parallel w"..., 107) = 107
| 00000 57 41 49 54 20 23 30 3a 20 6e 61 6d 3d 27 64 62 WAIT #0: nam='db |
| 00010 20 66 69 6c 65 20 70 61 72 61 6c 6c 65 6c 20 77  file parallel w |
| 00020 72 69 74 65 27 20 65 6c 61 3d 20 32 30 20 72 65 rite' ela= 20 re |
| 00030 71 75 65 73 74 73 3d 33 20 69 6e 74 65 72 72 75 quests=3 interru |
| 00040 70 74 3d 30 20 74 69 6d 65 6f 75 74 3d 30 20 6f pt=0 timeout=0 o |
| 00050 62 6a 23 3d 2d 31 20 74 69 6d 3d 31 33 38 39 38 bj#=-1 tim=13898 |
| 00060 30 32 32 38 31 34 35 36 37 34 30 02281456740 |
write(11, "WAIT #0: nam='db file parallel w"..., 106) = 106
| 00000 57 41 49 54 20 23 30 3a 20 6e 61 6d 3d 27 64 62 WAIT #0: nam='db |
| 00010 20 66 69 6c 65 20 70 61 72 61 6c 6c 65 6c 20 77  file parallel w |
| 00020 72 69 74 65 27 20 65 6c 61 3d 20 31 20 72 65 71 rite' ela= 1 req |
| 00030 75 65 73 74 73 3d 33 20 69 6e 74 65 72 72 75 70 uests=3 interrup |
| 00040 74 3d 30 20 74 69 6d 65 6f 75 74 3d 30 20 6f 62 t=0 timeout=0 ob |
| 00050 6a 23 3d 2d 31 20 74 69 6d 3d 31 33 38 39 38 30 j#=-1 tim=138980 |
| 00060 32 32 38 31 34 35 38 34 39 32 2281458492 |
3 pwrite() calls. This is synchronous IO!
The db file parallel write wait event shows 3 requests!
But why a second db file parallel write wait event?
dbwr, synchronous IO
• There’s no ‘db file async I/O submit’ wait anymore.
  – Which is good, because SIO has no submit phase.
• The ‘db file parallel write’ waits seem suspicious.
  – It seems like the wait for DFPW is issued twice.
  – Further investigation shows that it is.
• My guess is that this is a bug in the sync. IO implementation.
• Let’s look a level deeper and see if there’s more to see.
dbwr, synchronous IO
kslwtbctx
semtimedop - 458755 semid, timeout: $18 = {tv_sec = 3, tv_nsec = 0}
kslwtectx -- Previous wait time: 1239214: rdbms ipc message
pwrite64 - fd, size - 256,8192
pwrite64 - fd, size - 256,8192
pwrite64 - fd, size - 256,8192
kslwtbctx
kslwtectx -- Previous wait time: 949: db file parallel write
kslwtbctx
kslwtectx -- Previous wait time: 650: db file parallel write
kslwtbctx
semtimedop - 458755 semid, timeout: $19 = {tv_sec = 1, tv_nsec = 620000000}
This is clearly the semaphore being posted: timeout = 3 s, wait time = 1239.2 ms
3 IOs in serial using pwrite(). That is the only possibility without AIO, of course.
Two db file parallel write waits (which aren’t parallel), both of which begin AFTER the IO (!!)
After the IOs are done, the dbwr continues sleeping.
dbwr, synchronous IO
• Let’s do the same trick as done earlier:
  – In gdb, add “shell sleep 0.1” to the pwrite call.
  – This makes the execution of this call take 100ms longer.
  – To see if there’s still some way Oracle times it properly.
dbwr, synchronous IO
kslwtbctx
semtimedop - 458755 semid, timeout: $23 = {tv_sec = 3, tv_nsec = 0}
kslwtectx -- Previous wait time: 92080: rdbms ipc message
pwrite64 - fd, size - 256,8192
> shell sleep 0.1
pwrite64 - fd, size - 256,8192
> shell sleep 0.1
pwrite64 - fd, size - 256,8192
> shell sleep 0.1
kslwtbctx
kslwtectx -- Previous wait time: 478: db file parallel write
kslwtbctx
kslwtectx -- Previous wait time: 495: db file parallel write
kslwtbctx
semtimedop - 458755 semid, timeout: $24 = {tv_sec = 2, tv_nsec = 460000000}
The 3 IOs again, each sleeps in pwrite() for 100ms (0.1s)
Yet the ‘db file parallel write’ wait shows a waiting time of 478; which is 0.478ms: the timing is wrong.
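The reasoning behind both gdb experiments (the earlier io_submit() one and this pwrite() one) fits in a few lines: a wait that brackets a call can never be shorter than the delay injected into that call. A sketch with the numbers from the traces. The function name wait_brackets_call is made up for this illustration.

```python
DELAY_US = 100_000  # 'sleep 0.1' injected from gdb, in microseconds


def wait_brackets_call(reported_wait_us, injected_delay_us, calls=1):
    """True if the reported wait is long enough to contain the delayed calls."""
    return reported_wait_us >= injected_delay_us * calls


# AIO: one delayed io_submit(), 'db file async I/O submit' reported 428 us.
print(wait_brackets_call(428, DELAY_US))

# SIO: three delayed pwrite() calls, the two DFPW waits reported 478 and 495 us.
print(wait_brackets_call(478 + 495, DELAY_US, calls=3))
```

Both checks come out False: in neither case can the reported wait have contained the delayed system calls.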
dbwr, synchronous IO
• So, my conclusion on the wait events for the dbwr with synchronous IO:
  – The events are not properly timed.
  – It seems like the wait for DFPW is issued twice.
  – Further investigation shows that it is.
• My guess is that this is a bug in the sync. IO implementation.
dbwr, synchronous IO - 12c
• Starting from 12.1.0.2 the implementation changed!
semtimedop semid:819205, timeout:3,0
kskthewt (8)
kskthbwt (156)
pwrite64 fd:256, buf:93425000, size:65536, offset:2006130688
pwrite64 fd:256, buf:93401000, size:106496, offset:2006016000
pwrite64 fd:256, buf:85a18000, size:8192, offset:1191026688
pwrite64 fd:256, buf:84550000, size:8192, offset:956989440
kskthewt (156)
kskthbwt (156)
kskthewt (156)
semctl semid:819205, semnum:17, cmd:16
kskthbwt (8)
semtimedop semid:819205, timeout:3,0
Sleeping on semaphore.
Wait starts BEFORE the writes are issued!
And ends AFTER the writes are issued!
Databasewriter - ASM - sync IO
[Figure. Legend: wait event ‘db file parallel write’; pread() system call. Versions shown: 11.2.0.1, 11.2.0.2, 11.2.0.3, 11.2.0.4, 12.1.0.1, 12.1.0.2. Additional label: log file parallel write.]
dbwr: exadata
• What does this look like on Exadata?
dbwr: exadata including 12ckslwtbctx semtimedop - 3309577 semid, timeout: $389 = {tv_sec = 3, tv_nsec = 0} kslwtectx -- Previous wait time: 1266041: rdbms ipc message $390 = "oss_write" $391 = "oss_write" $392 = "oss_write" $393 = "oss_write" $394 = "oss_write" $395 = "oss_write"
kslwtbctx kslwtectx -- Previous wait time: 684: db file async I/O submit
kslwtbctx $396 = "oss_wait" $397 = "oss_wait" kslwtectx -- Previous wait time: 2001: db file parallel write $398 = "oss_wait" $399 = "oss_wait" $400 = "oss_wait" $401 = "oss_wait"
semctl - 3309577,23,16 - semid, semnum, cmd kslwtbctx semtimedop - 3309577 semid, timeout: $402 = {tv_sec = 1, tv_nsec = 630000000} kslwtectx -- Previous wait time: 1634299: rdbms ipc message
The dbwr semaphore sleep.
The writes are issued here. This is not timed.
Here is the db file async I/O submit wait. Again, it doesn’t seem to time any of the typical IO calls!
And here is the db file parallel write wait. It always seems to time two oss_wait() calls…
But it then looks for more IOs to finish, like the trailing io_getevents() calls. I am quite sure oss_wait() is a blocking call…
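The annotations above can be checked mechanically. The following is a minimal sketch, assuming that everything between a kslwtbctx line and the next kslwtectx line counts as "timed" by the wait event named on the kslwtectx line. The helper names (call_name, classify) are hypothetical and not part of any Oracle or gdb tooling; the trace lines are copied from the slide.

```python
import re

# Trace lines copied from the slide (gdb breakpoint output).
TRACE = """\
kslwtbctx
semtimedop - 3309577 semid, timeout: $389 = {tv_sec = 3, tv_nsec = 0}
kslwtectx -- Previous wait time: 1266041: rdbms ipc message
$390 = "oss_write"
$391 = "oss_write"
$392 = "oss_write"
$393 = "oss_write"
$394 = "oss_write"
$395 = "oss_write"
kslwtbctx
kslwtectx -- Previous wait time: 684: db file async I/O submit
kslwtbctx
$396 = "oss_wait"
$397 = "oss_wait"
kslwtectx -- Previous wait time: 2001: db file parallel write
$398 = "oss_wait"
$399 = "oss_wait"
$400 = "oss_wait"
$401 = "oss_wait"
semctl - 3309577,23,16 - semid, semnum, cmd
kslwtbctx
semtimedop - 3309577 semid, timeout: $402 = {tv_sec = 1, tv_nsec = 630000000}
kslwtectx -- Previous wait time: 1634299: rdbms ipc message
"""


def call_name(line):
    """Return the function name from a line such as '$396 = "oss_wait"',
    or the first word for lines such as 'semtimedop - ...'."""
    m = re.match(r'\$\d+ = "(\w+)"', line)
    return m.group(1) if m else line.split()[0]


def classify(trace):
    """Split calls into those inside a wait (kslwtbctx..kslwtectx) and those outside."""
    timed = []      # list of (event name, wait time, [calls inside the wait])
    untimed = []    # calls seen outside any wait
    in_wait = False
    pending = []
    for line in trace.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("kslwtbctx"):
            in_wait, pending = True, []
        elif line.startswith("kslwtectx"):
            m = re.search(r"Previous wait time: (\d+): (.+)$", line)
            event = m.group(2) if m else "unknown"
            wait_time = int(m.group(1)) if m else 0
            timed.append((event, wait_time, pending))
            in_wait, pending = False, []
        else:
            (pending if in_wait else untimed).append(call_name(line))
    return timed, untimed


timed, untimed = classify(TRACE)
for event, wait_time, calls in timed:
    print(event, wait_time, calls)
print("not timed:", untimed)
```

Under this assumption the six oss_write calls and four of the six oss_wait calls fall outside any wait event, which matches the annotations on the slide.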
Conclusion
• Logwriter:
– When idle, is sleeping on a semaphore/rdbms ipc message.
– Gets posted with semctl() to do work.
– Only writes when it needs to do so.
– Version 11.2.0.3: two methods for posting FGs:
  – Polling and post/wait.
  – Post/wait is default, might switch to polling.
  – Notification of switch is in log writer trace file.
  – Polling/nanosleep() time is variable.
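The difference between the two waiting styles can be illustrated with a small analogy. This is only a sketch using Python's threading module, not the SysV semtimedop()/semctl() or nanosleep() calls the database actually uses; the function names wait_for_post and wait_by_polling and all timeout values are made up for illustration.

```python
import threading
import time


def wait_for_post(sem, timeout):
    """Post/wait style: block on a semaphore until another party posts it,
    or until the timeout expires (compare semtimedop() with a timeout)."""
    return "posted" if sem.acquire(timeout=timeout) else "timeout"


def wait_by_polling(is_done, sleep_time, max_polls):
    """Polling style: nobody posts the waiter; it sleeps, then checks for itself.
    Returns the number of polls needed, or None if max_polls was reached."""
    for polls in range(1, max_polls + 1):
        time.sleep(sleep_time)          # stands in for nanosleep()
        if is_done():
            return polls
    return None


# Post/wait: nothing posted yet, so the sleep ends on the timeout.
sem = threading.Semaphore(0)
print(wait_for_post(sem, timeout=0.01))

# Another party posts the semaphore; the next wait returns immediately.
sem.release()
print(wait_for_post(sem, timeout=0.01))

# Polling: the condition becomes true on the third check.
checks = {"count": 0}


def write_finished():
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_by_polling(write_finished, sleep_time=0.001, max_polls=10))
```

With post/wait the waiter costs nothing while asleep but depends on being posted; with polling the waiter wakes up on its own schedule, so the chosen sleep time bounds how late it notices that the work is done.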
Conclusion
• Logwriter:
– Log file parallel write
  – AIO: two io_getevents() calls.
  – AIO: time waiting for all lgwr submitted IOs to finish.
    – Not IO latency time!
  – SIO: does not do parallel writes, but serial.
  – SIO: does not time IO.
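Why "time waiting for all IOs to finish" is not the same as IO latency can be shown with simple arithmetic. The sketch below is a simplified model that assumes all IOs are submitted at the same moment, right when the wait starts; the latency figures are invented for illustration and the function names are hypothetical.

```python
def aio_parallel_write_wait(latencies_us):
    """Asynchronous IO model: all IOs are in flight together, so waiting for
    all of them to finish takes as long as the slowest single IO."""
    return max(latencies_us)


def sio_serial_elapsed(latencies_us):
    """Synchronous IO model: writes are issued one after another, so the
    elapsed time is the sum of the individual latencies."""
    return sum(latencies_us)


# Made-up latencies (microseconds) for three redo writes.
latencies_us = [2000, 5000, 3000]

wait_us = aio_parallel_write_wait(latencies_us)
average_us = sum(latencies_us) // len(latencies_us)
serial_us = sio_serial_elapsed(latencies_us)

print("AIO wait for all IOs:", wait_us)
print("average single IO latency:", average_us)
print("SIO serial elapsed:", serial_us)
```

In this model the AIO wait reports the slowest IO only, so it says little about the latency of the other writes, and the serial elapsed time of the synchronous case is something the event does not time at all according to the slide.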
Conclusion
• Logwriter:
– Wait event IO timing:
  – All the ‘* parallel read’ and ‘* parallel write’ events do not seem to time IO correctly with synchronous IO*.
  – All the events which cover single block IOs do use synchronous IO calls, even with asynchronous IO set.
– Logwriter writes a warning to the log writer trace file when IO time and SCN broadcast ack time exceed 500ms.
– _disable_logging *only* disables writes to the logs.
Conclusion
• Database writer:
– When idle, is sleeping on a semaphore/rdbms ipc message.
– Gets posted with semctl() to do work.
– Only writes when it needs to do so.
– Since version 11.2.0.2, event ‘db file async I/O submit’:
  – Is not shown with synchronous I/O.
  – Shows the actual number of IOs submitted.
  – Does not time io_submit().
  – Unknown what it times, or whether it times anything.
Conclusion
• Database writer:
– Event ‘db file parallel write’:
  – Shows the minimum number of requests io_getevents() waits for.
  – The number of requests it waits for varies, but mostly seems to be ~ 25-33% of submitted IOs.
  – After the timed, blocking io_getevents() call, it issues two non-blocking io_getevents() calls for the remaining non-reaped IOs, if any.
  – My current idea is that the blocking io_getevents() call is an IO throttle mechanism.
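The effect of waiting for only a minimum number of requests can be modelled in a few lines. This is a sketch under stated assumptions: a blocking io_getevents() call returns as soon as at least min_nr IOs have completed, all IOs are submitted when the wait starts, and the completion times are invented. The function name reap_model is hypothetical.

```python
def reap_model(completion_times_us, min_nr):
    """Model one blocking io_getevents() call that waits for min_nr completions.

    Returns (timed_wait_us, reaped_by_blocking_call, left_for_nonblocking_calls).
    """
    if not 1 <= min_nr <= len(completion_times_us):
        raise ValueError("min_nr must be between 1 and the number of IOs")
    done = sorted(completion_times_us)
    # The blocking call returns when the min_nr-th IO completes.
    timed_wait = done[min_nr - 1]
    # Every IO completed by that moment can be reaped by the same call.
    reaped = sum(1 for t in done if t <= timed_wait)
    return timed_wait, reaped, len(done) - reaped


# Twelve made-up writes completing 100us apart; wait for 4 of them (33%).
completion_times_us = [100 * i for i in range(1, 13)]
timed_wait, reaped, remaining = reap_model(completion_times_us, min_nr=4)

print("timed wait:", timed_wait)
print("reaped by blocking call:", reaped)
print("left for the non-blocking calls:", remaining)
```

In this model the event covers the time until the fourth write completes; the other eight writes finish outside the timed window, which is why the event time cannot be read as the time needed to write the whole batch.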
Conclusion
• Database writer:
– Event ‘db file parallel write’, with synchronous IO:
  – pwrite64() calls are issued serially.
  – These are not timed*.
  – The event is triggered twice.
– On Exadata, two out of the total number of oss_wait() calls are timed with the event ‘db file parallel write’.
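Serially issued positional writes look roughly like the sketch below, which also times each call from the outside, since the wait event does not. It assumes a Unix-like system (os.pwrite), a temporary file in place of a datafile, and an 8192-byte block size chosen for illustration; serial_pwrite is a hypothetical helper, not how the database writer is implemented.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8192  # assumed block size, for illustration only


def serial_pwrite(fd, blocks, block_size):
    """Write the blocks one after another with pwrite(), timing each call.

    Returns the elapsed time of every call in nanoseconds.
    """
    timings_ns = []
    for i, block in enumerate(blocks):
        start = time.perf_counter_ns()
        written = os.pwrite(fd, block, i * block_size)  # one blocking call per block
        timings_ns.append(time.perf_counter_ns() - start)
        if written != len(block):
            raise OSError("short write")
    return timings_ns


fd, path = tempfile.mkstemp()
try:
    # Four dummy blocks. The file is opened without O_SYNC or O_DIRECT, so these
    # writes normally land in the page cache and the timings are not device latencies.
    blocks = [bytes([i]) * BLOCK_SIZE for i in range(4)]
    timings = serial_pwrite(fd, blocks, BLOCK_SIZE)
    size = os.fstat(fd).st_size
finally:
    os.close(fd)
    os.unlink(path)

print("calls:", len(timings), "total ns:", sum(timings), "file size:", size)
```

Because each call must return before the next one starts, the total elapsed time is the sum of the individual calls, and none of it shows up in the wait event according to the findings above.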
Q & A
Thanks & Links
• Enkitec
• Tanel Poder, Martin Bach, Klaas-Jan Jongsma, Jeremy Schneider, Karl Arao, Michael Fontana, Luca Canali.
• http://www.pythian.com/blog/adaptive-log-file-sync-oracle-please-dont-do-that-again/
• http://files.e2sn.com/slides/Tanel_Poder_log_file_sync.pdf