Experience in running relational databases on clustered storage
Ruben.Gaspar.Aparicio_@_cern.ch
On behalf of the IT-DB storage team, IT Department
HEPIX 2014
Nebraska Union, University of Nebraska – Lincoln, USA
Agenda
• CERN databases basic description
• Storage evolution using Netapp
• Caching technologies
  • Flash cache
  • Flash pool
• Data motion
• Snapshots
• Cloning
• Backup to disk
• directNFS
• Monitoring
  • In-house tools
  • Netapp tools
• iSCSI access
• Conclusions
CERN’s databases
• ~100 Oracle databases, most of them RAC
• Mostly NAS storage, plus some SAN with ASM
• ~500 TB of data files for production DBs in total
• Examples of critical production DBs:
  • LHC logging database: ~190 TB, expected growth up to ~70 TB/year
  • 13 production experiments’ databases: ~10-20 TB each
  • Read-only copies (Active Data Guard)
• Also offered as DBaaS, as single instances:
  • 148 MySQL Open community databases (5.6.17)
  • 15 PostgreSQL databases (being migrated to 9.2.9)
  • 12 Oracle 11g → migrating towards Oracle12c multi-tenancy
Use case: Quench Protection System
• Critical system for LHC operation
• Major upgrade for LHC Run 2 (2015-2018)
• High-throughput requirement for data storage: constant load of 150k changes/s from 100k signals
• Whole data set is transferred to the long-term storage DB: query + filter + insertion
• Analysis performed on both DBs
[Diagram: RDB archive fed by 16 projects around the LHC, transferred to the LHC Logging (long-term storage) database, with backup]
Quench Protection System: tests
After two hours of buffering:
• Nominal conditions: stable constant load of 150k changes/s, 100 MB/s of I/O operations, 500 GB of data stored each day
• Peak performance: exceeded 1 million value changes per second, 500-600 MB/s of I/O operations
Oracle basic setup
Mount Options for Oracle files when used with NFS on NAS devices (Doc ID 359515.1)
[Diagram: an Oracle RAC database uses at least 10 file systems in a global namespace; 10GbE links to clients (GPN, mtu=1500) and storage, private interconnect network (mtu=9000), 12 Gbps links to the disk shelves]
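The Oracle note referenced above prescribes specific NFS mount options. A typical /etc/fstab entry for a datafile volume might look like the following sketch (server and path names are illustrative; take the exact options for your platform from Doc ID 359515.1):

```
dbnas1-priv:/ORA/dbs03/MYDB  /ORA/dbs03/MYDB  nfs  rw,bg,hard,nointr,rsize=65536,wsize=65536,tcp,vers=3,timeo=600,actimeo=0  0 0
```

The key points are `hard` mounts (never silently drop IO), large rsize/wsize, and `actimeo=0` so all RAC nodes see consistent file attributes.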
Oracle file systems

| Mount point | Content |
|---|---|
| /ORA/dbs0a/${DB_UNIQUE_NAME} | ADR (including listener) + adump log files |
| /ORA/dbs00/${DB_UNIQUE_NAME} | Control file + copy of online redo logs |
| /ORA/dbs02/${DB_UNIQUE_NAME} | Control file + archive logs (FRA) |
| /ORA/dbs03/${DB_UNIQUE_NAME}* | Datafiles |
| /ORA/dbs04/${DB_UNIQUE_NAME} | Control file + copy of online redo logs + block change tracking file + spfile |
| /ORA/dbs0X/${DB_UNIQUE_NAME}* | More datafile volumes if needed |
| /CRS/dbs00/${DB_UNIQUE_NAME} | Voting disk |
| /CRS/dbs02/${DB_UNIQUE_NAME} | Voting disk + OCR |
| /CRS/dbs00/${DB_UNIQUE_NAME} | Voting disk + OCR |

* Mounted using their own LIF to ease volume movements within the cluster
Netapp evolution at CERN (last 8 years)
| | Before | Now |
|---|---|---|
| Controllers | FAS3000 | FAS6200 & FAS8000 |
| Disks | 100% FC disks | Flash Pool/Flash Cache = 100% SATA disks + SSD |
| Shelves | DS14 mk4 FC (2 Gbps) | DS4246 (6 Gbps) |
| ONTAP | Data ONTAP 7-mode (scaling up) | Data ONTAP Cluster-Mode (scaling out) |
A few 7-mode concepts
• Private network (client access, autosupport)
• FlexVolume
• Remote LAN Manager
• Service Processor
• Rapid RAID Recovery
• Maintenance center (at least 2 spares)
• raid_dp or raid4; raid.scrub.schedule (once weekly); raid.media_scrub.rate (constantly)
• reallocate
• Thin provisioning
• File access: NFS, CIFS
• Block access: FC, FCoE, iSCSI
A few C-mode concepts
• Cluster, node shell, systemshell
• Private network: cluster interconnect, cluster mgmt network, client access
• cluster ring show — RDB: vifmgr + bcomd + vldb + mgmt
• Vserver (protected via SnapMirror)
• Global namespace
• Logical interface (LIF)
• Logging files from the controller are no longer accessible by a simple NFS export
Consolidation
• Before: 56 controllers (FAS3000) & 2300 disks (1400 TB storage), as storage islands accessible via the private network
• After: 14 controllers (FAS6220) & 960 disks (1660 TB storage)
• Easy management; difficulties finding slots for interventions
RAC50 setup
• Cluster interconnect, using FC GBICs for distances longer than 5 m
• SFPs must be from Cisco (if Cisco switches are in use)
[Diagram: cluster interconnect plus primary and secondary private-network switches]
Flash cache
• Helps increase random-read IOPS on disks
• Warm-up effect (options flexscale.rewarm); cf operations (takeover/giveback) invalidate the cache, but user-initiated ones do not since ONTAP 8.1
• TR-3832: Flash Cache Best Practices Guide (technical report from Netapp)
• For databases, decide which volumes to cache:

```
fas3240> priority on
fas3240> priority set volume volname cache=[reuse|keep]
```

• options flexscale.lopri_blocks off
Flash cache: database benchmark
• Inner table (3 TB) where a row = a block (8 KB); outer table (2% of the inner table) where each row contains a rowid of the inner table
• Measured via v$sysstat ‘physical reads’
• Starts with db file sequential read, but after a little while changes to db file parallel read

| Random Read IOPS* | No PAM | PAM + kernel NFS (RHEL5) | PAM + dNFS |
|---|---|---|---|
| First run | 2903 | 2890 | 3827 |
| Second run | 2900 | 16397 (~240 data disks) | 37811 (~472 data disks) |

*fas3240, 32 SATA disks of 2 TB, Data ONTAP 8.0.1, Oracle 11gR2
Flash cache: long-running backups
• During backups the SSD cache is flushed
• IO latency increases; hit% on PAM goes down to ~1%
• Possible solutions:
  • Data Guard
  • priority set enabled_components=cache
  • Large IO windows to improve sequential IO detection, possible in C-mode:

```
vserver nfs modify -vserver vs1 -v3-tcp-max-read-size 1048576
```

© by Luca Canali
Flash pool aggregates
• 64-bit aggregates only
• Aggregate snapshots must be deleted before converting into a hybrid aggregate
• SSD rules: minimum number and extension sizes depend on the model, e.g. FAS6000: 9+2, extensions of 6 (with 100 GB SSDs)
• No mixed disk types in a hybrid aggregate: just SAS + SSD, FC + SSD, or SATA + SSD; no mixed disk types within a raid group
• Different protection levels can be combined between the SSD RAID and HDD RAID, e.g. raid_dp or raid4
• A hybrid aggregate cannot be rolled back
• If the SSD raid groups are not available, the whole aggregate is down
• SSD raid groups do not count towards total aggregate space
• Maximum SSD size depends on model & ONTAP release (https://hwu.netapp.com/Controller/Index)
• TR-4070: Flash Pool Design and Implementation Guide
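Converting an existing aggregate can be sketched with two clustered ONTAP commands (aggregate name and SSD count are illustrative; the per-model minimums mentioned above apply):

```
rac50::> storage aggregate modify -aggregate aggr1_rac5071 -hybrid-enabled true
rac50::> storage aggregate add-disks -aggregate aggr1_rac5071 -disktype SSD -diskcount 11
```

Note the one-way nature mentioned above: once SSDs are added, the hybrid aggregate cannot be rolled back.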
Flash pool behaviour
• Blocks going into SSD are determined by write and read policies; they apply per volume or globally on the whole aggregate
• Sequential data is not cached; data cannot be pinned
• A heat map decides what stays in the SSD cache and for how long:
  • Random overwrites (size < 16 KB) and random reads are inserted into the SSD cache; other data is written to disk
  • Cached blocks move through the states hot → warm → neutral → cold → evict: reads and overwrites re-warm a block, while the eviction scanner (run every 60 s when SSD consumption > 75%) cools it until eviction
Flash pool: performance counters
• Performance counters: wafl_hya_per_aggr (299 counters) & wafl_hya_per_vvol (16 counters)
• We have automated the way to query those
• Around 25% difference in an empty system: ensures enough pre-erased blocks to write new data
• Read-ahead caching algorithms
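Our automation simply polls these counter objects; by hand, roughly the same data can be pulled at the clustershell like this (a sketch — the sample name is arbitrary and the exact counter list varies by ONTAP release):

```
rac50::> set -privilege diagnostic
rac50::*> statistics start -object wafl_hya_per_aggr -sample-id flashpool_sample
rac50::*> statistics show -sample-id flashpool_sample
```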
Flash pool behaviour: warm-up times
• Using fio: 500 GB dataset, random IO
• The read cache warms up more slowly than the write cache (~6 hours in our tests)
• Reads (ms) cost more than writes (μs)
• Stats of SSD consumption can be retrieved using the wafl_hya_per_vvol object, at the nodeshell in diagnostic level
Flash pool: random reads
db file sequential read IOPS (Oracle 11.2.0.4, Red Hat 6.4, SLOB2, 1 TB dataset):

| Sessions | 1 | 2 | 4 | 8 | 16 | 32 | 48 | 56 | 64 | 78 | 92 | 112 | 128 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SSD_mtu1500 | 980 | 2135 | 4404 | 8826 | 16134 | 27751 | 37550 | 40817 | 44051 | 43930 | 43599 | 42651 | 40983 |
| SSD_mtu9000 | 3932 | 5222 | 7256 | 11366 | 19660 | 34182 | 44297 | 48058 | 47828 | 45368 | 47127 | 46234 | 46160 |
| noSSD_mtu1500 | 105 | 245 | 516 | 997 | 1677 | 2637 | 3690 | 4184 | 4415 | 4676 | 5031 | 5427 | 5810 |
| noSSD_mtu9000 | 116 | 234 | 494 | 983 | 1746 | 2797 | 3424 | 3835 | 4155 | 4432 | 4727 | 5041 | 5352 |

Average latency (µs), db file sequential read:

| Sessions | 1 | 2 | 4 | 8 | 16 | 32 | 48 | 56 | 64 | 78 | 92 | 112 | 128 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SSD_mtu1500 | 997 | 912 | 882 | 879 | 964 | 1122 | 1249 | 1340 | 1418 | 1731 | 2058 | 2563 | 3046 |
| SSD_mtu9000 | 241 | 369 | 536 | 687 | 793 | 911 | 1056 | 1136 | 1305 | 1678 | 1906 | 2366 | 2712 |
| noSSD_mtu1500 | 9379 | 8090 | 7676 | 7942 | 9467 | 12019 | 12864 | 13226 | 14331 | 16470 | 18056 | 20326 | 21679 |
| noSSD_mtu9000 | 8984 | 8447 | 8041 | 8066 | 9062 | 11350 | 13840 | 14475 | 15245 | 17413 | 19204 | 21934 | 23606 |

With the SSD cache this is equivalent to ~600 HDDs, with just 48 HDDs in reality.
1TB dataset, 100% in SSD, 56 sessions, random reads
10TB dataset, 36% in SSD, 32 sessions, random reads
Flash pool: long-running backups
Full backup running for 3.5 days
Vol move
• Powerful feature for rebalancing and interventions, at whole-volume granularity
• Transparent, but watch out on volumes with high IO (writes)
• Based on SnapMirror technology: an initial transfer, then incremental updates until cutover

Example vol move command:

```
rac50::> vol move start -vserver vs1rac50 -volume movemetest -destination-aggregate aggr1_rac5071 -cutover-window 45 -cutover-attempts 3 -cutover-action defer_on_failure
```
Oracle12c: online datafile move
• Very robust, even under high IO load
• Takes advantage of database memory buffers
• Works with OMF (Oracle Managed Files)
• Track it in the alert.log and v$session_longops
Oracle12c: online datafile move (II)
[alert.log excerpt showing an `alter database move datafile` statement and its completion]
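The statement behind this slide, with hypothetical file names, plus the long-running-operation query we use to follow progress:

```sql
ALTER DATABASE MOVE DATAFILE '/ORA/dbs03/MYDB/datafile/users01.dbf'
  TO '/ORA/dbs05/MYDB/datafile/users01.dbf';

-- Track progress of the move:
SELECT target_desc, sofar, totalwork, units
FROM   v$session_longops
WHERE  opname = 'Online data file move';
```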
DBaaS: backup management
• Same backup procedure for all RDBMS
• Backup workflow — quiesce, snapshot, resume:

```
mysql> FLUSH TABLES WITH READ LOCK;
mysql> FLUSH LOGS;
-- or
Oracle> alter database begin backup;
-- or
Postgresql> SELECT pg_start_backup('$SNAP');

-- take storage snapshot, then resume:

mysql> UNLOCK TABLES;
-- or
Oracle> alter database end backup;
-- or
Postgresql> SELECT pg_stop_backup(), pg_create_restore_point('$SNAP');

-- … some time later: new snapshot
```
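A minimal bash sketch of how one common procedure can cover all three engines: pick the quiesce/resume statements per engine and bracket the storage snapshot with them (a sketch only — the function names are ours, and the real tool is the ~15k-line Perl/Bash solution described later):

```shell
#!/bin/bash
# Emit the engine-specific quiesce statement run just before the snapshot.
quiesce_sql() {
  case "$1" in
    mysql)      echo "FLUSH TABLES WITH READ LOCK; FLUSH LOGS;" ;;
    oracle)     echo "alter database begin backup;" ;;
    postgresql) echo "SELECT pg_start_backup('\$SNAP');" ;;
    *)          return 1 ;;
  esac
}

# Emit the engine-specific resume statement run just after the snapshot.
resume_sql() {
  case "$1" in
    mysql)      echo "UNLOCK TABLES;" ;;
    oracle)     echo "alter database end backup;" ;;
    postgresql) echo "SELECT pg_stop_backup(), pg_create_restore_point('\$SNAP');" ;;
    *)          return 1 ;;
  esac
}

# Show the full workflow for each engine.
for engine in mysql oracle postgresql; do
  echo "== $engine =="
  quiesce_sql "$engine"
  echo "-- storage snapshot here --"
  resume_sql "$engine"
done
```

For MySQL the lock only survives as long as the client session, so in practice the snapshot must be taken from within that session (or the session kept open) rather than from a second `mysql -e` invocation.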
Snapshots in Oracle
• Storage-based technology
• Speed-up of backups/restores: from hours/days to seconds
• Handled by a plug-in in our backup and recovery solution. Example:

```
/etc/init.d/syscontrol --very_silent -i rman_backup start -maxRetries 1 -exec takesnap_zapi.pl -debug -snap dbnasr0009-priv:/ORA/dbs03/PUBSTG level_EXEC_SNAP -i pubstg_rac50
```

• Drawback: lack of integration with RMAN
  • ONTAP commands: snap create/restore; SnapRestore requires a license
  • Snapshots are not available via the RMAN API, but some solutions exist: Netapp MML Proxy API, Oracle SnapManager
• pubstg: 280 GB size, ~1 TB archivelogs/day → snapshot in 8 s
• adcr: 24 TB size, ~2.5 TB archivelogs/day → snapshot in 9 s
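At the ONTAP level the plug-in boils down to the snapshot commands; in clustered ONTAP they look roughly like this (vserver, volume, and snapshot names are illustrative):

```
rac50::> volume snapshot create -vserver vs1rac50 -volume pubstg_dbs03 -snapshot pre_upgrade
rac50::> volume snapshot restore -vserver vs1rac50 -volume pubstg_dbs03 -snapshot pre_upgrade
```

The restore (SnapRestore) is what requires the extra license noted above.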
Cloning
• On the storage backend, a new license is required
• Offers new possibilities, especially for clustered databases
• Oracle12c multi-tenancy and PostgreSQL both include special SQL to handle cloning
• TR-4266: NetApp Cloning Plug-in for Oracle Multitenant Database 12c
• It can be combined with any data-protection strategy that involves snapshots (not there yet), e.g. create a cloned DB from a backup snapshot for fast testing of an application upgrade
Backup architecture
• Custom solution: about 15k lines of code, Perl + Bash
• Export policies restrict access to the NFS shares to DB servers
• Extensive use of compression and some deduplication
• Storage compression does not require a license; an Oracle Advanced Compression license is required for the low, medium and high types (see later)
• We send compressed: 1 out of 4 full backups, and all archivelogs
Backup to disk: throughput (one head)
[Chart: backup-to-disk throughput over time, annotated with data-scrubbing periods and compression ratios]
720 TB used; 603 TB saved, mainly due to compression but also deduplication (control files)
Oracle12c compression
Backup sizes and times on new servers (32 cores, 129 GB RAM, Intel Xeon E5-2650 @ 2.00 GHz).

Oracle 11.2.0.4:

| Source (no-compressed) | basic | low | medium | high | No-compressed-fs | Inline compression (Netapp 8.2P3) |
|---|---|---|---|---|---|---|
| 392 GB (devdb11) | 62.24 GB (1h54’) | 89.17 GB (27’30’’) | 73.84 GB (1h01’) | 50.71 GB (7h17’) | 349 GB (22’35’’) | 137 GB (22’35’’) |
| Saved (%) | 82% | 74.4% | 78.8% | 85.4% | 0% | 62% |

Oracle 12.1.0.1:

| Source (no-compressed) | basic | low | medium | high | No-compressed-fs | Inline compression (Netapp 8.2P3) |
|---|---|---|---|---|---|---|
| 376 GB (devdb11 upgraded to 12c) | 45.2 GB (1h29’) | 64.13 GB (22’) | 52.95 GB (48’) | 34.17 GB (5h17’) | 252.8 GB (22’) | 93 GB (20’) |
| Saved (%) | 82.1% | 74.6% | 79% | 86.4% | 0% | 64.5% |
| 229.2 GB (tablespace using Oracle Crypto) | 57.4 GB (2h45’) | 57.8 GB (10’) | 58.3 GB (44’’) | 56.7 GB (4h13’) | 230 GB (12’30’’) | 177 GB (15’45’’) |
| Saved (%) | 74.95% | 74.7% | 74.5% | 75.2% | 0% | 22.7% |
Oracle directNFS
• In Oracle12c, enable dNFS by:

```
$ORACLE_HOME/rdbms/lib/make -f ins_rdbms.mk dnfs_on
```

• Mount Options for Oracle files when used with NFS on NAS devices (Doc ID 359515.1)
• RMAN backups for disk backups kernel NFS (Doc ID 1117597.1)
• Linux/NetApp: RHEL/SUSE Setup Recommendations for NetApp Filer Storage (Doc ID 279393.1)
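Once enabled, dNFS discovers its mounts through an oranfstab file; a minimal sketch (server name, paths, and export are illustrative — see the notes above for the supported options):

```
server: dbnas1
path: 10.1.1.1
path: 10.1.1.2
export: /ORA/dbs03/MYDB mount: /ORA/dbs03/MYDB
```

Multiple `path` lines let dNFS load-balance and fail over across storage interfaces without OS-level bonding.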
Oracle directNFS (II)
• dNFS vs the kernel NFS stack while using parallel queries
In-house tools
• Main aim is to give our DBAs and system admins access to the storage
• Based on ZAPI (the Netapp storage API) and programmed in Perl and Bash; about 5000 lines of code
• All scripts work on C-mode and 7-mode → no need to know how to connect to the controllers or the different ONTAP CLI versions
In-house tool: snaptool.pl
• Operations: create, list, delete, clone, restore, …
• API available programmatically
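The slide’s usage example was an image; invocations look roughly like the following (hypothetical syntax, reconstructed from the verb list above, reusing the volume path from the snapshot slide):

```
$ snaptool.pl create dbnasr0009-priv:/ORA/dbs03/PUBSTG pre_upgrade
$ snaptool.pl list dbnasr0009-priv:/ORA/dbs03/PUBSTG
$ snaptool.pl delete dbnasr0009-priv:/ORA/dbs03/PUBSTG pre_upgrade
```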
In-house tool: smetrics
• Check online statistics of a particular file system, or of the controller serving it
• Volume stats & histograms
Netapp monitoring/mgmt tools
• Unified OnCommand Manager 5.2 (Linux)
  • Authentication using PAM
  • Extensive use of reporting (in 7-mode)
  • Works for both 7-mode and C-mode
  • Performance management console (performance counter display)
  • Alarms
• OnCommand Performance Manager (OPM) & OnCommand Unified Manager (OUM)
  • Used for C-mode
  • Virtual machine (VM) that runs on a VMware ESX or ESXi server
• System Manager
  • We use it mainly to check setups
• My AutoSupport at the NOW website
iSCSI SAN
• iSCSI requires an extra license
• All storage features are available for LUNs
• Access through the CERN routable network
• Using the Netapp CINDER (OpenStack block storage implementation) driver
  • Exports iSCSI SAN storage to KVM or Hyper-V hypervisors
• The solution is being tested
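For this kind of setup, the NetApp Cinder driver of that era is configured through a backend stanza in cinder.conf; a sketch with placeholder hostname, credentials, and vserver (values are illustrative, not our production config):

```ini
[netapp-iscsi]
volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_family = ontap_cluster
netapp_storage_protocol = iscsi
netapp_server_hostname = rac50-mgmt.example.ch
netapp_login = admin
netapp_password = secret
netapp_vserver = vs1rac50
```

The `ontap_cluster` family matches the C-mode deployment described earlier; the driver then carves LUNs out of the vserver’s volumes on demand.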
Conclusions
• Positive experience so far running on C-mode
• Mid-to-high-end Netapp NAS provides good performance using the Flash Pool SSD caching solution
• The flexibility of clustered ONTAP helps to reduce the investment
• Design of stacks and network access requires careful planning
Acknowledgement
• IT-DB colleagues, especially Lisa Azzurra Chinzer and Miroslav Potocky, members of the storage team
Questions