con1741 mcintosh top 10 database performance tips for sparc systems running oracle solaris
TRANSCRIPT
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 121
Top 10 Database Performance Tips for SPARC Systems Running Oracle SolarisLawrence McIntosh and Roger BitarOracle Optimized Solutions, Systems Group
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 123
Program Agenda
Background
Motivation
Methodology
Oracle Optimized Solutions
Areas covered
Tools and techniques
Takeaways
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 124
The following is intended to outline our general product direction. It is
intended for information purposes only, and may not be incorporated
into any contract. It is not a commitment to deliver any material,
code, or functionality, and should not be relied upon in making
purchasing decisions. The development, release, and timing of any
features or functionality described for Oracle’s products remains at
the sole discretion of Oracle.
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 125
Background
Evolution of SPARC Systems and Oracle Solaris technologies :– SPARC T5 / M5 Systems
– MPO, ZFS, Zones, OVM, Critical Thread support, etc.
– ZFSSA, Exadata Storage, SAN Storage Flash Technologies
– Infiniband (IB) based Interconnect technologies
– Oracle SuperCluster
– Oracle Optimized Solutions
Performance tips are based on internal performance analysis and optimizations over many years on a variety of Systems and Database workloads
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 126
Motivation
Develop Well Balanced Systems that are highly optimized Highlight lesser known facts about performance issues Discuss techniques for diagnosing performance degradation,
variance, etc. Highlight tips for improving Database performance on SPARC
Systems running Oracle Solaris Highlight the benefit of live monitoring to diagnose systemic issues Encourage use of Oracle Solaris tools to correlate various discrete
events
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 127
Methodology Apply known “Best Practices” Follow Performance optimization process
Performance below expectation, variance, degradation over time, etc.
Optimization Diagnosis
Symptoms
Systemic Analysis Tuning Tips
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 128
Sizing and configuration optimizations
Load/stress tests
Performance and scalability
tests
Real world workload
tests
Patch regression
tests
Faultinjection
tests
Optimizations Across the Development CycleEngineered, Tested, and Proven from Apps-to-Disk
Interoperability tests
End to end functional validation
Early development
tests
Identify integration
opportunities
Full Stack OptimizationsOne Engineering Team
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 129
Observability
System performance : Solaris + Platform – Process/thread analysis : lockstat, plockstat, truss, prstat and pstack
– Memory Placement Optimization : plgrp, pmap, lgrpinfo
– Interrupt assignment and load balancing : mpstat, mdb, intrstat
– Hardware utilization and capacity planning : cpustat, corestat, pgstat, pginfo
– Advanced level of debugging : Dtrace
Correlate data from different tools to identify systemic issues
– For e.g. high GC latencies <-> mpstat output <-> prstat output
– Can lead us to scheduler issues or interrupt saturation, etc
Tools/Techniques
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1210
Profiling techniques
DTrace – Systemic analysis : End to end user to kernel to user
– Drill-down analysis : e.g. High syscalls, which system calls, who issues it , what are the callstacks, arguments to the calls etc.
– Speculative analysis
– Proc, Sched and IO providers
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1211
Areas Covered
Consolidation RAC performance Memory allocation CPU/Memory affinity Scheduling for better response time I/O performance Systems approach to Cloning
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1212
Consolidation Best Practices
Optimize utilization of shared Hardware resources– Share Oracle distribution, common ORACLE_HOME
Improve Instance Caging – Leverage Solaris resource management framework (Oracle 11.2.0.3)
– Enable processor set aware lgroup memory allocation
Consider Core boundary for optimal Logical Domain configuration(T-series) Oracle Solaris 11.1 release provides better Virtualization technologies (RDS V3 support in Zones, IO Virtualization, etc.) for DB consolidation
In /etc/system set lgrp_mem_pset_aware = 1
# Configure Resource Pools: psets/Sharesset processor_group_name = “PROCESSOR_GROUP_NAME” in init.ora
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1213
RAC Performance
Symptom : – High “cluster wait time” resulting in high transaction response time (AWR)
Diagnosis : – Monitor CPU Utilization of LMS processes (prstat)
– Monitor intrstat for CPU’s handling Network interrupts (requires privileges)
Tunings: – Run LMS in FX60, Do not over-configure number of LMS processes
– UDP tunables – Follow best practices for Network tuning.
High Response Time
# prstat from saturated LMSPID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID4816 oracle 70 21 0.5 0.0 0.0 0.0 8.1 0.7 10K 6K . 3M 6 oracle/1# prstat from “good” LMSPID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID4396 oracle 31 11 0.4 0.0 0.0 0.0 56 1.2 39K 4K 2M 6 oracle/1
In /etc/system set ip:ip_squeue_bind = 0
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1214
Memory Allocation
Solaris Memory Organization (Size) Supports multiple pagesizes for Database instance(s) Tries to allocate SGA using largest available pagesize Coalesces smaller pages into larger pages as required
Solaris Memory Organization (Placement) CPU-Memory affinity abstracted as locality groups (lgroup) lgroup aware memory allocation
Database instance startup time depends on availability of large pages
Oracle Solaris Concepts
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1215
Memory Allocation (SGA)
DB instance startup time depends on shared memory allocation time Oracle Solaris parallelizes allocation using kernel threads (VMTASKS)
– Default maximum number of vmtasks limited to 16
– On larger systems, vmtasks limit can be increased for better parallelism
– For example, add the following line to /etc/system
– or modify on a live system using mdb
Set vmtask_ntasks_max parameter to 10-20% of available cpus
Improving DB Startup Time by faster SGA Allocation
set vmtask_ntasks_max = 0x20
# echo “vmtask_ntasks_max /W 0x20” | /bin/mdb -kw
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1216
Memory Allocation (SGA)
Effect of vmtasks on SGA allocation time
Improving DB Startup Time by faster SGA Allocation (contd.)
0
20
40
60
80
100
120
140
8 16 24 32 48 64 80
SGA Allocation Time (secs) vs #vmtasksT3-4 (450GB SGA)
#vmtasks
0
20
40
60
80
100
120
140
16 32 48 64 80 96 112 128 144
SGA Allocation Time (secs) vs #vmtasksM9000-64 (900GB SGA)
#vmtasks
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1217
Memory Allocation (SGA)
Comparing SGA allocation time for different SGA sizes
Improving DB Startup Time by faster SGA Allocation (contd.)
0
10
20
30
40
50
60
70
32 54 128 256 450
T3-4 SGA Allocation Times (secs)
vs.SGA Size
#vmtasks=16#vmtasks=64
0
20
40
60
80
100
120
140
32 54 128 256 512 900
M9000-64SGA Allocation Times (secs)
vsSGA Size
#vmtasks=16#vmtasks=64
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1218
Memory Allocation (SGA)
Symptoms : Slow Database instance startup
Diagnosis : Memory fragmentation due to several database startups/shutdowns, i/o to
file systems, other types of i/o, etc. Mixed size pages due to page coalescing issues Monitor pagesizes using “ pmap –xs `pgrep –f ora_dbw0` ”
Slow DB Startup due to large-page availability
Address Kbytes RSS Anon Locked Pgsz Mode Mapped File0000000380000000 262144 262144 - 262144 256M rwxsR [ ism shmid=0xd ]0000000390000000 131072 131072 - 131072 4M rwxsR [ ism shmid=0xd ]0000000400000000 56623104 56623104 - 56623104 2G rwxsR [ ism shmid=0xe ]0000001180000000 24 24 - 24 8K rwxsR [ ism shmid=0xf ]
Address Kbytes RSS Anon Locked Pgsz Mode Mapped File0000000380000000 262144 262144 - 262144 256M rwxsR [ ism shmid=0x200005d ]0000000390000000 131072 131072 - 131072 4M rwxsR [ ism shmid=0x200005d ]0000000400000000 52428800 52428800 - 52428800 2G rwxsR [ ism shmid=0x200005e ]0000001080000000 4194304 4194304 - 4194304 256M rwxsR [ ism shmid=0x200005e ]0000001180000000 24 24 - 24 8K rwxsR [ ism shmid=0x200005f ]
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1219
Memory Allocation (SGA)
Suggested Tips :– Limit ZFS Cache size to 20% of available memory
Add the following to /etc/system (e.g. to set it to 2GB) (requires reboot)
If ZFS is used heavily, refer to ZFS tuning guide for recommendation
– Consider system reboot as a last resort
Next Generation Database addresses some of these concerns– Helps improve Instance Startup
– Helps Database Availability
Slow DB Startup due to large-page availability (contd.)
set zfs:zfs_arc_max = 2147483648
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1220
Memory Allocation (SGA)
Symptoms : – 5% to 10% performance variance for OLTP
– Multiple DB instances started over time on the same Solaris instance
Diagnosis:– Monitor SGA allocation across lgroups using “pmap –L”
Performance Variance due to uneven memory allocation across lgroups
# pmap –Ls `pgrep –f ora_dbw0` | egrep “Address|shmid”
Address Bytes Pgsz Mode Lgrp Mapped File 0000000380000000 262144K 256M rwxsR 3 [ ism shmid=0x1 ]0000000390000000 786432K 256M rwxsR 2 [ ism shmid=0x1 ]00000003C0000000 262144K 256M rwxsR 3 [ ism shmid=0x1 ]00000003D0000000 262144K 256M rwxsR 4 [ ism shmid=0x1 ]00000003E0000000 262144K 256M rwxsR 1 [ ism shmid=0x1 ]00000003F0000000 1048576K 256M rwxsR 3 [ ism shmid=0x1 ]..
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1221
Memory Allocation (SGA)
– Monitor memory allocations across lgroups using “lgrpinfo –cm”
Tips: – Use lgrpinfo (Solaris S10U9 onwards) before and after Database startup
– Isolate Database instance(s) and/or Applications if possible
Uneven memory allocation across lgroups (contd.)
# lgrpinfo -cmlgroup 0 (root): CPUs: 0-255 Memory: installed 512G, allocated 256G, free 256Glgroup 1 (leaf): CPUs: 0-63 Memory: installed 128G, allocated 84G, free 44Glgroup 2 (leaf): CPUs: 64-127 Memory: installed 128G, allocated 44G, free 84Glgroup 3 (leaf): CPUs: 128-191 Memory: installed 128G, allocated 63G, free 65Glgroup 4 (leaf): CPUs: 192-255 Memory: installed 128G, allocated 65G, free 63G
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1222
Memory Allocation (SGA/PGA)
Symptom : – 5% to 10% performance variance for OLTP workloads
– Performance variance seen in DSS Parallel Queries
Diagnosis : Monitor pagesizes using “pmap –xs”
– Large Page Availability for PGA PGA memory is dynamically allocated/freed - may not always get same
number of large pages
Performance variance due to insufficient large pages
Address Kbytes RSS Anon Locked Pgsz Mode Mapped File0000000380000000 262144 262144 - 262144 256M rwxsR [ ism shmid=0xd ]0000000390000000 131072 131072 - 131072 4M rwxsR [ ism shmid=0xd ]0000000400000000 56623104 56623104 - 56623104 2G rwxsR [ ism shmid=0xe ]0000001180000000 24 24 - 24 8K rwxsR [ ism shmid=0xf ] Address Kbytes RSS Anon Locked Pgsz Mode Mapped File0000000380000000 262144 262144 - 262144 256M rwxsR [ ism shmid=0x200005d ]0000000390000000 131072 131072 - 131072 4M rwxsR [ ism shmid=0x200005d ]0000000400000000 52428800 52428800 - 52428800 2G rwxsR [ ism shmid=0x200005e ]0000001080000000 4194304 4194304 - 4194304 256M rwxsR [ ism shmid=0x200005e ]0000001180000000 24 24 - 24 8K rwxsR [ ism shmid=0x200005f ]
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1223
CPU Utilization
Uneven CPU Utilization– Dynamic creation/deletion of processor sets, onlining / offlining of CPUs
– Short lived applications creating load imbalance
Diagnosis:– Monitor load average using kstat , correlate with mpstat
– Monitor uneven home lgroup assignment using plgrp
Tip : Create controlled environment with appropriate affinity grouping
Performance variance due to load imbalance
# kstat -m lgrp -s "load average" -p 10
lgrp:1:lgrp1:load average 60249lgrp:2:lgrp2:load average 30075lgrp:3:lgrp3:load average 50572lgrp:4:lgrp4:load average 37336
# plgrp -a all $$ PID/LWPID HOME AFFINITY 1610/1 1 0-4/none# plgrp -A 2/strong $$ PID/LWPID HOME AFFINITY 1610/1 1 => 2 2/none => 2/strong# plgrp -a all $$ PID/LWPID HOME AFFINITY 1610/1 2 2/strong,0,1,3,4/none
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1224
Process Scheduling
Scheduling Classes TS, FX, RT Dynamically modified using priocntl (requires privileges)
MPO aware load balancing Balanced home lgroup assignment and lgroup aware scheduler
CMT aware scheduling Critical Thread optimization (available in Solaris 11, Solaris 10U10)
Dynamically provides exclusive access to shared hardware resources Enabled by running critical software processes/threads at FX 60
Observability proc tools (plgrp, prstat)
Oracle Solaris Concepts
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1225
Process Scheduling
OLTP Applications can benefit from Critical Thread optimization Run logwriter and LMS processes at FX 60 (requires privileges)
Monitor critical processes using ps -c, prstat -cvmL– USR/SYS columns for reduction in cpu utilization
– ICX column for involuntary context switches
Response time improvement
# priocntl –c FX -m 60 -p 60 -s `/usr/bin/pgrep -f ora_lgwr`# priocntl –c FX -m 60 -p 60 -s `/usr/bin/pgrep -f ora_lms`
# prstat -cvmL PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 1421 oracle 24 28 0.2 0.0 0.0 0.0 44 3.6 81K 4K 86K 0 oracle/1
# prstat -cvmL PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 1421 oracle 22 24 0.2 0.0 0.0 0.0 52 3.0 92K 3K 98K 0 oracle/1
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1226
Process Scheduling
Use FX for foregrounds (OLTP) and PQ Slaves (DSS)
Typically, up to a 5% improvement with FX over TS Apply with caution for systems running mixed workloads (OLTP, DSS, Batch, Others)
FX scheduling for response time improvement
88 90 92 94 96 98 100 102 104 106
DSS Response Time
OLTP Response Time
OLTP Throughput
TS FX Relative Performance
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1227
Log File Sync
Symptom:– OLTP transactions showing high response time
– Throughput not increasing with more load (in spite of idle CPU)
– “log file sync” in the Top Wait Events (AWR)
– “log file parallel write” may or may not be too high (AWR)
Diagnosis: – Monitor CPU utilization of lgwr process (prstat )
– Watch out for high %sys time for lgwr process (prstat)
– Watch out for high scheduling latency for lgwr process (prstat)
High Response Time
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1228
Log File Sync
Tunings:– Reduce semaphore contention
Lower processes per semaphore set (process.max-sem-nsems) upto 64 Use projadd/projmod to add entry to /etc/project or prctl to dynamically change
Large improvements to lgwr efficiency, thereby “log file sync” time Typically, upto a 10% improvement in overall throughput
High Response Time (contd.)
# projadd -U oracle -K "process.max-sem-nsems=(priv,64,deny)" user.oracle# projmod –a -K "process.max-sem-nsems=(priv,64,deny)" user.oracle# prctl -n process.max-sem-nsems -r -v 64 -i process <PID>
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1229
Log File Sync
Tuning : – Improve LGWR efficiency and scheduling latency
Run LGWR in FX60
Provide LGWR exclusive access to shared H/W resources
Consider run-time scenario such as multiple DB instances, available cores/CPUs for the instance before applying tip
High Response Time (contd.)
# #Create Processor Set # psrset –c 56-63# #Turn off all but one CPU in the processor set# psradm –f 57-63# #Bind the lgwr to the processor set# psrset –b 1 `pgrep –f ora_lgwr`# #Mark the CPU as non-interruptible# psrset –f 56
# priocntl –c FX -m 60 -p 60 -s `/usr/bin/pgrep -f ora_lgwr`
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1230
Log File Sync
Can expect significant reduction in log file sync time depending on the configuration such as number of CPUs, log device etc.
– prstat data with lgwr in default TS scheduling class
– prstat data after moving lgwr to FX60
– Reduced lgwr cpu utilization by >10%
– Improved “log file sync” time by >10%
High Response Time (contd.)
# prstat -cvmL PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 1421 oracle 24 28 0.2 0.0 0.0 0.0 44 3.6 81K 4K 86K 0 oracle/1
# prstat -cvmL PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 1421 oracle 22 24 0.2 0.0 0.0 0.0 52 3.0 92K 3K 98K 0 oracle/1
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1231
I/O Performance
Symptoms : – OLTP transactions showing high response time
– “db file sequential/parallel read “ in the Top Wait Events (AWR)
Diagnosis : – Monitor intrstat for CPU’s handling I/O interrupts (requires privileges)
– Monitor mpstat for Interrupt CPU saturation
High I/O response time
# intrstat –c 15 5
# #intrstat,mpstat outputs device | cpu15 %tim------------- +--------------------- qlc#1 | 33644 70.5
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 15 0 0 97386 33451 33244 1421 471 0 8434 0 0 0 96 0 4
# #intrstat,mpstat outputs device | cpu2 %tim cpu15 %tim-------------+--------------------------------------------- qlc#1 | 0 0.0 34793 40.9 qlc#5 | 34259 40.7 0 0.0
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 2 0 0 48854 34999 34135 2217 41 310 2918 0 1081 2 69 0 29 15 0 0 48668 35057 34716 147 0 4 2595 0 10 0 66 0 34
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1232
I/O Performance
Tips: –Consider isolating interrupt CPUs using processor sets
Improved I/O Performance Diagnosability in Next Generation Database– Dtrace tightly integrated with Database monitoring framework
– New V$ view(s) added for reporting I/O response time breakup
High I/O response time (contd.)
# #intrstat,mpstat outputs device | cpu9 %tim------------- +--------------------- qlc#5 | 22862 52.6CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl 9 0 0 29881 24613 22324 7017 163 2856 164 0 3302 3 89 0 8
# #intrstat,mpstat outputs after processor set device | cpu9 %tim------------- +--------------------- qlc#5 | 24230 53.3CPU minf mjf xcal Intr ithr csw icsw migr smtx srw syscl usr sys wt idl 9 0 0 29602 23884 23633 48 10 0 3327 0 0 0 80 0 20
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1233
I/O Performance
Make sure HBAs are installed according to best practices Tips:
– Make sure I/O is spread over all available controllers/channels/ports
– Consider enabling MPXIO (in case of SAN) for smoothening response time Use stmsboot(1M) or modify /kernel/drv/fp.conf (for fibre-channel devs)“mpathadm list lu” command shows the total operational paths to each Lun:
High I/O response time (contd.)
# mpathadm list lu snip /dev/rdsk/c0t000B080012004133d0s2 Total Path Count: 8 Operational Path Count: 8 snip
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1234
Before Flash Cache
Notes– 15 minute snapshots ‘under load’: but 9.5 hours of buffer waits!
– 77 minutes of commit time
Top 5 Timed Foreground Events Avg %Total~~~~~~~~~~~~~~~~~~ wait CallEvent Waits Time (s) (ms) Time Wait Class---------------------------- ---------- --------- ------ ---- ---------db file sequential read 3,189,229 34,272 11 67.8 User I/OCPU time 11,332 22.4log file sync 2,247,374 4,612 2 9.1 Commitgc cr grant 2-way 1,365,247 793 1 1.6 Clusterenq: TX – index contention 140,257 720 5 3.1 Concurrenc -------------------------------------------------------------
What is the bottleneck here?
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1235
11gR2 Database Smart Flash Cache
Acts as Level 2 SGA Changes physical read I/O to
logical I/O Rule of sizing: 2x – 10x SGA
size Best accelerates read intensive
workloadsFew I/O’s
Buffer Cache
Storage
Buffer Cache
Database SmartFlash Cache
Many I/O’s
Storage
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1236
Database Smart Flash Cache Setup
Aggregate Flash Modules to pool– ASM preferred
– Concat. No mirroring - it is a cache!
Set two init.ora parameters– db_flash_cache_file = <+flashdg/FlashCacheFile>
Path to flash file/raw aggregation/metadevice
– db_flash_cache_size = <flash pool size> L2 buffer cache size: amount of flash to use
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1237
After Flash Cache
Results: 140x reduction in ‘db file sequential reads’!– Average Flash Cache Read time 540us vs 10.75ms: 20X quicker.
– Transaction and commit rate also went up over 40%!
Top 5 Timed Foreground Events Avg %Total~~~~~~~~~~~~~~~~~~ wait CallEvent Waits Time (s) (ms) Time Wait Class---------------------------- ---------- --------- ------ ---- ---------CPU time 11,353 57.6log file sync 1,434,247 6,587 3 33.4 Commitflash cache single block read 4,221,599 2,284 1 21.3 User I/OBuffer busy waits 723,807 1,502 329 3.3 Concurrencdb file sequential read 22,727 182 8 .9 User I/O -------------------------------------------------------------
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1238
‘db file sequential read’ Summary
Majority of OLTP Databases have this wait event At Minimum - 20X reduction in storage response time common with flash
vs HDD in arrays. – Due to Lower Latency
2x improvement in SLA typical when I/O bound.– 5x and higher improvements seen.
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1239
How Cloning Can Improve Database Lifecycle Management
Application development and testing Infrastructure updates without downtime Troubleshooting
Let’s take a systems approach to improving performance
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1240
Cloning Oracle Databases and Virtual Machines on Shared Storage
Datafiles stored over ASM
Production System
RMAN Backup
Dev/Test System
Oracle ZFS Storage Appliance
Z1
Datafiles: only updates are written to storage
Z4 Z5Z3Z2
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1241
Database Cloning BenefitsOracle’s Approach for Cloning Oracle Database
Runtime zone is cloned in minutes into several identical zones with all the zones and database running upon completion
Can have unlimited number of clones Clones take up minimal space, only updates are written to disk
No performance impact on production database
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1242
Recap: What we just discussed
Oracle Optimized Solutions Solaris performance tools Leverage Virtualization and Solaris/Oracle Database resource management Use shared Oracle distribution for Database Consolidation Bump up 'vmtasks' to improve Database startup time Limit ZFS ARC to improve large-page availability and Database startup time Isolate Database instances in a controlled environment for reproducible performance Use FX scheduling class to minimize performance variance Several possible tunings to reduce Log File Sync time Isolate interrupt CPUs and enable MPXIO for better I/O performance Cloning to improve Database Lifecycle Management and provide rapid OS and Database
setup
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1243
Takeaways
Performance optimization is an iterative process “What is not measured does not improve “
– Continuous enhancements to Solaris observability framework
– Enhanced Observability for better Diagnosability
– Symptoms visible in more than one place
Variety of tools and techniques available for Systems level tuning Achieve consistent performance with Solaris tunings Strive to create Well Balanced Systems
– Consider all components and build a complete Solutions Approach
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1244
References
• How to Consolidate, Optimize, and Deploy Oracle Database Environments with Flexible Virtualized Service
How to Accelerate Test and Development Through Rapid Cloning of Production Databases and Operating Environments
Deploying Oracle Database on the Oracle Solaris Platform
Solaris Dynamic Tracing Guide
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1245
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 1246