Living with the Oracle Database Appliance Simon Haslam, Veriton Peter Moore, Simplyhealth

Upload: veriton-limited

Post on 07-Aug-2015


TRANSCRIPT

Page 1: Living with the Oracle Database Appliance

Living with the Oracle Database Appliance

Simon Haslam, Veriton Peter Moore, Simplyhealth

Page 2: Living with the Oracle Database Appliance

Simon Haslam Consultant, Veriton &

Technical Director of

Oracle s/w since 1995

Middleware & SOA

WebLogic, SOA, BPM

Peter Moore Principal Oracle DBA & MW Admin, Simplyhealth

Oracle s/w since 1988

Oracle DBA for 19 years

Database Administrator

Page 3: Living with the Oracle Database Appliance

Introduction & Background

ODA BM/VP & Sizing of Recovery Area

Hardware Maintenance (ASR & Disk Failures)

Patching

Miscellaneous

Page 4: Living with the Oracle Database Appliance

What is ODA?

Two fast Intel compute nodes

Shared, direct attached storage array including flash

InfiniBand interconnect & 10Gb public networks

Management software (database & virtualisation)

Sold as a single product for $68k (list)

in a slide!

Page 5: Living with the Oracle Database Appliance

[Diagram of the ODA hardware. Labels on the slide: Bulk Data HDD; Redo Logs; ODA Cache SSD; Compute Node; Compute Node HDD; “Now with InfiniBand”]

Page 6: Living with the Oracle Database Appliance
Page 7: Living with the Oracle Database Appliance

Background

Started in 1872
◦ Previously… HSA, BCWA, HealthSure, LHF, Remedi, Medisure, Denplan

Primary business areas
◦ Health Cash Plans
◦ Private Medical Insurance
◦ Dental Capitation
◦ Healthcare delivery

Over 3M customers / 20,000 companies

~1700 Employees

Page 8: Living with the Oracle Database Appliance

Core IT

Product / CRM / Finance Application

~1000 Users / 600 Active

3M Customer records

Java EE and PL/SQL

3rd Party communications platform

RAC (2TB main db), WebLogic, Reports

Page 9: Living with the Oracle Database Appliance

Simplyhealth’s ODAs

[Diagram: a Production ODA and a Test ODA, each with two nodes running ODA Base, alongside a ZFS Appliance. Labels on the slide: OLTP; Reporting standby; Comms; OLTP standby; Comms standby; Reporting; APEX portal; Test; UAT; UAT Comms; TTD container VM 1; TTD container VM 2; RMAN OLTP; archive; RMAN standby OLTP]

Page 10: Living with the Oracle Database Appliance

ODA BM/VP & Sizing of Recovery Area

Page 11: Living with the Oracle Database Appliance

Virtualized Platform: databases

Database
◦ Each node has an “ODA Base” DomU
◦ Looks a lot like ODA BM: most admin is done from ODA Base

Nodes
◦ Run a special OVS image

Appliance Manager
◦ GUI when you first provision it
◦ oakcli tool

[Diagram: Node 0 and Node 1 each run OVS and host an ODA Base DomU containing Appliance Manager, Database(s) and Grid Infrastructure. Other labels on the slide: Dom0 and Repo for each node; Local, Local and Shared Storage]

Lots of room for app VMs like SOA

Page 12: Living with the Oracle Database Appliance

ODA BM or VP?

Simplyhealth chose ODA VP
◦ Initially driven by WebLogic
◦ Turned out to be good for test databases

If in doubt Simon recommends ODA VP:
◦ gives you more flexibility in future (app & probably database)
◦ only moderate extra operational complexity

Page 13: Living with the Oracle Database Appliance

Sizing of RECO

DATA is on outer part of hard disks, RECO on inner

Only set during initial provisioning

[Diagram: the DATA/RECO split across the disks, drawn once for each backup option. Default: “Local Backup” “External Backup”]

Page 14: Living with the Oracle Database Appliance

DATA:RECO Sizes

Disks are physically partitioned according to whether Local or External Backup was chosen

Same ratios for all ODA hardware versions and HIGH/NORMAL redundancy

“Local Backup”: DATA 43% (outer), RECO 57% (inner)

“External Backup”: DATA 86% (outer), RECO 14% (inner)

Page 15: Living with the Oracle Database Appliance

Usable Space Example: ODA X5-2, 1 shelf, NORMAL redundancy

“Local Backup”: DATA 12TB, RECO 16TB

“External Backup”: DATA 24TB, RECO 4TB

REDO 250GB

FLASH 750GB
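As a rough cross-check of the percentages against the X5-2 example, the split can be worked through in a few lines. This is only a sketch: the 28 TB total is simply DATA + RECO from the example (12 + 16, or 24 + 4), and `split_capacity` is a name made up for illustration, not an Oracle tool.

```python
# Fractions of usable HDD capacity per backup option, as quoted on the slides.
SPLITS = {
    "local": {"DATA": 0.43, "RECO": 0.57},
    "external": {"DATA": 0.86, "RECO": 0.14},
}


def split_capacity(usable_tb: float, backup: str) -> dict:
    """Return approximate DATA and RECO sizes (TB) for a backup option."""
    return {
        group: round(usable_tb * fraction, 1)
        for group, fraction in SPLITS[backup].items()
    }


if __name__ == "__main__":
    # Assumption: 28 TB usable, i.e. DATA + RECO from the X5-2 example.
    for option in SPLITS:
        print(option, split_capacity(28, option))
```

The results land within about 0.1 TB of the slide's rounded figures, which suggests the quoted sizes are just the percentages applied to one usable total.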

Page 16: Living with the Oracle Database Appliance

Hardware Maintenance (ASR & Disk Failures)

Page 17: Living with the Oracle Database Appliance

My Oracle Support Set up

Use a team MOS account + group email dist. list

Ensure MOS account has access to correct ODA CSI(s)

Page 18: Living with the Oracle Database Appliance

MOS

Oddity: you can only activate ASR on the ODA nodes so why this warning/button? (you don’t get this on ZFSSA)

Page 19: Living with the Oracle Database Appliance

ASR Set up

Stand-alone ASR on each ODA

Each server needs internet access to https://transport.oracle.com

oakcli configure asr
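Before configuring ASR it may be worth confirming that each node can actually open a connection to the transport endpoint. The sketch below is an assumption-laden illustration rather than part of the ODA tooling: `can_reach` is a hypothetical helper, port 443 is inferred from the https URL, and a plain TCP connect says nothing about proxies or TLS.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0,
              connect=socket.create_connection) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    `connect` is injectable so the check can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
    conn.close()
    return True


# Example use on each ODA node (hypothetical):
#   can_reach("transport.oracle.com")
```

If the nodes only reach the internet through a proxy, a direct connect like this would fail even though ASR could still work once the proxy is configured.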

Page 20: Living with the Oracle Database Appliance

ASR Test

Option 1: Internal ASR
◦ Enter root password (x2)
◦ Enter MOS credentials

Page 21: Living with the Oracle Database Appliance

ASR Disk failure example

Page 22: Living with the Oracle Database Appliance
Page 23: Living with the Oracle Database Appliance
Page 24: Living with the Oracle Database Appliance

ASR Funnies

ASR raises one SR per disk… or none… or two…

Sometimes the first we knew of a disk failure was when Oracle updated the SR
◦ New ODA plug-in for EM is expected to include hardware notifications

Page 25: Living with the Oracle Database Appliance

ASR Further Diagnostics

Page 26: Living with the Oracle Database Appliance

Our Disk History

We have 2 x dual shelf ODA X3-2s
◦ 16 SSD & 88 HDD
◦ Running for 1.5 years (1.35M HDD-hours)

Total of 6 HDDs have been replaced (i.e. 225k h MTBF)
◦ 5 predicted failures
◦ 1 real failure… bad experience with I/O waits though

No SSDs have failed

Note: new ZFS SA disk arrived automatically next morning without sys admin knowing it had failed! (ODA should be more like this)
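The MTBF figure follows directly from the quoted HDD-hours and the replacement count. A minimal sketch of that arithmetic, with function names chosen here purely for illustration; like the slide, it counts predicted failures as failures.

```python
HOURS_PER_YEAR = 24 * 365


def mtbf_hours(device_hours: float, failures: int) -> float:
    """Mean time between failures: total device-hours per failure."""
    return device_hours / failures


def implied_years(device_hours: float, devices: int) -> float:
    """Average running time per device implied by a device-hours total."""
    return device_hours / devices / HOURS_PER_YEAR


if __name__ == "__main__":
    print(mtbf_hours(1.35e6, 6))                 # 225000.0
    print(round(implied_years(1.35e6, 88), 2))   # 1.75
```

Spread over 88 HDDs, 1.35M HDD-hours comes to roughly 1.75 years per disk, so the "1.5 years" on the slide is presumably a rounded figure.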

Page 27: Living with the Oracle Database Appliance

Disk Failure ‘Gotchas’

1 predicted failure fixed itself!

General fiddliness of replacing disks
◦ Firmware updating, getting new disks ONLINE, etc
◦ MOS 1435946.1 & 1496114.1

The replacement disk includes the courier details to collect the failed one…
◦ this is a European courier who will know nothing about it!
◦ we need the UK courier

Blinking yellow light doesn’t always work?!

Page 28: Living with the Oracle Database Appliance

Patching

Page 29: Living with the Oracle Database Appliance

Patching: It’s Really Good!

Vastly simplified process compared to DIY for full stack

Approx. quarterly ODA-only bundled patches ◦ includes PSU for databases (optional)

Oracle Support says to stay no more than 2 versions behind current

There’s probably a backlog of ODA customers on 2.10 (last 11g GI but CPU only to April 2014)

Page 30: Living with the Oracle Database Appliance

prep • Download & load to patch repositories on ODA nodes

INFRA • Update INFRA

GI • Update GI

db • (optional) Update database Oracle Homes & databases

Page 31: Living with the Oracle Database Appliance

Upgrade Example ODA 2.10 to 12.1.2.2.0 INFRA, GI, DB PSU

An 11g to 12c CRS/ASM upgrade would probably have been a project in itself pre-ODA

We only have a single 11.2.0.4.x Oracle Home
◦ some people have several, e.g. for different apps

Page 32: Living with the Oracle Database Appliance

prep
• scp p20340774_121220_Linux-x86-64_[12]of2.zip
• oakcli unpack -package p20340774… {for each zip, on each node}
• oakcli update -patch 12.1.2.2.0 --verify

INFRA
• oakcli update -patch 12.1.2.2.0 --infra

GI
• oakcli update -patch 12.1.2.2.0 --gi

db
• oakcli update -patch 12.1.2.2.0 --database
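The sequence above can be written down as an ordered plan, which is handy for a runbook or for comparing a test run against production. This sketch only builds and prints the command strings from the slide; it does not execute anything. The staging directory is a hypothetical placeholder (the slide elides the path), and `patch_plan` is an illustrative name.

```python
PATCH_VERSION = "12.1.2.2.0"
# The two zip files named on the slide ([12]of2).
PACKAGES = [f"p20340774_121220_Linux-x86-64_{n}of2.zip" for n in (1, 2)]


def patch_plan(version: str, packages: list, staging_dir: str) -> list:
    """Return the oakcli commands in the order the slide lists them."""
    # Unpack every zip first (the slide notes: for each zip, on each node).
    steps = [f"oakcli unpack -package {staging_dir}/{pkg}" for pkg in packages]
    steps.append(f"oakcli update -patch {version} --verify")
    # Then INFRA, GI and (optionally) the database homes, in that order.
    for component in ("infra", "gi", "database"):
        steps.append(f"oakcli update -patch {version} --{component}")
    return steps


if __name__ == "__main__":
    # Assumption: /tmp stands in for wherever the zips were copied.
    for command in patch_plan(PATCH_VERSION, PACKAGES, "/tmp"):
        print(command)
```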

Page 33: Living with the Oracle Database Appliance

12c GI / 11g PSU Upgrade Timeline

--infra 2h 29min

--gi 1h 12min

--database 40min

App Prep. 1h

Lost 1h 10min

Restarting app etc

Elapsed outage for app ~6h

Supposed to be rolling? (all DBs shutdown)

Supposed to be rolling? Both nodes rebooted automatically

Databases were open for most of the day but we were never sure when they would be shut down… (our lack of experience of ODA patching?)

Possibly bug in shared repo upgrade
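Adding up the durations quoted on the timeline gives a feel for where the day went. This is a sketch under a loud assumption: that the five timed phases ran back to back with no overlap, which the slide does not state.

```python
from datetime import timedelta

# Durations as quoted on the slide; sequencing is an assumption.
PHASES = {
    "App prep": timedelta(hours=1),
    "--infra": timedelta(hours=2, minutes=29),
    "Lost": timedelta(hours=1, minutes=10),
    "--gi": timedelta(hours=1, minutes=12),
    "--database": timedelta(minutes=40),
}


def total_elapsed(phases: dict) -> timedelta:
    """Sum the phase durations, assuming they do not overlap."""
    return sum(phases.values(), timedelta())


if __name__ == "__main__":
    print(total_elapsed(PHASES))  # 6:31:00
```

The sum comes to about six and a half hours, in the same region as the ~6h app outage quoted, so either the figures are rounded or not every phase counted against the application.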

Page 34: Living with the Oracle Database Appliance
Page 35: Living with the Oracle Database Appliance

What happened under the covers?

INFRA updates
◦ BIOS
◦ ILOM
◦ Firmware updated on all disks (except new ones)
◦ OVM 3.2.9

GI updates
◦ CRS 12.1.0.2.2
◦ ASM 12.1.2.x.0 (i.e. inc Flex ASM)
◦ ODA Base to Oracle Linux 5.10 UEK2

Database PSU
◦ Oracle home to 11.2.0.4.5 (plus 12.1.0.2.2, 11.2.0.3.13 if we had them)
◦ Databases updated (some!)

…and probably much more!

Page 36: Living with the Oracle Database Appliance

DB Patch-Set Update

Choose which Oracle Home(s) to apply PSU to

Script loops through databases running in each updated home & runs catbundle.sql
◦ Recognises standbys: (correctly) didn’t apply the PSU to them but still shut them down! Perhaps because they shared the home being patched? Possibly our fault!
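The behaviour described here can be modelled in a few lines to make the standby case explicit. This is a toy model of what the presenters observed, not Oracle's patching script: the `Database` class, the role strings and `psu_actions` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    oracle_home: str
    role: str  # "PRIMARY" or "STANDBY" (illustrative values)


def psu_actions(databases: list, patched_home: str) -> dict:
    """Describe what happens to each database running from the patched home."""
    actions = {}
    for db in databases:
        if db.oracle_home != patched_home:
            continue  # homes that were not selected are left alone
        if db.role == "PRIMARY":
            actions[db.name] = "shut down, run catbundle.sql"
        else:
            # Standbys are recognised and skipped for catbundle.sql, but a
            # shared home being patched still means a shutdown.
            actions[db.name] = "shut down, skip catbundle.sql"
    return actions
```

Seen this way, the standby shutdown looks like a consequence of sharing the Oracle Home with the primaries; a separate home for standbys would sidestep it, at the cost of another home to patch.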

Page 37: Living with the Oracle Database Appliance
Page 38: Living with the Oracle Database Appliance

Strange Error Messages

Some strange messages, but mostly harmless:
◦ Console: “An error occurred while restoring domain oakDom1: Error: not a valid guest state file: config size read”

But… 2 of us were watching everything very closely
◦ Probably better to just go for a long lunch instead!

Page 39: Living with the Oracle Database Appliance

Patching Wish List

Status/confidence ◦ more timestamps (for checking back later – test vs prod)

◦ a progress indicator for anything taking over ~3 min, e.g. “INFO: Running prepatching on node 0” took ~20 mins

Could firmware updates of disks (35 mins) be done in parallel?

Page 40: Living with the Oracle Database Appliance

Patching Wish List

Help us to understand which parts of the process are rolling (could be different per ODA version) and how to minimise downtime
◦ Is INFRA ever rolling?
◦ GI rolling?
◦ DB rolling if using RAC or RAC One Node (RON)?

Page 41: Living with the Oracle Database Appliance

Patching Nirvana: Rolling Upgrades for Everything?!

Size of ODA X5-2 invites DB consolidation

Simplyhealth: Lack of rolling INFRA will drive all non-UAT databases off test ODA (v hard to test bundled patches on pre-prod/UAT)

O-box SOA Appliance: sold on the strength of its HA, so it needs rolling updates below the WebLogic layer

Page 42: Living with the Oracle Database Appliance

Miscellaneous

Page 43: Living with the Oracle Database Appliance

NFS Storage for Databases

Oracle ZFS and NFS (e.g. NetApp) are supported
◦ See MOS 1445253.1: External Storage (read/write) Support
◦ Use files over NFS, not via ASM

Uses Direct NFS (dNFS): fast
◦ we have 10 GbE network dedicated to storage

Not so self-contained so perhaps not “the ODA way”

Page 44: Living with the Oracle Database Appliance

An Innovative Approach for Test DBs

Requirement:
◦ To use DB EE NUP licences for test, when the 2 ODA bases are licensed by RAC processor

Solution:
◦ One large VM on each node with multiple Linux Containers
◦ Test databases within the containers use ZFS SA for storage

Suffers from lack of rolling upgrades for ODA INFRA

Technical Credit/Implementation: Mark Leeuw & Fabrizio Bordaccini

Page 45: Living with the Oracle Database Appliance

Backup & Disaster Recovery

Data Guard works well of course

ODA VP & ODA Base?
◦ In practice you need to rebuild

VMs running on ODA VP?
◦ Host level backup within VM
◦ ACFS Replication...?

Oracle White Paper: Backup and Recovery Best Practices for the Oracle Database Appliance (April 2014)

Page 46: Living with the Oracle Database Appliance

Management

Looking forward to trying the new EM 12c R4 ODA plug-in

Initial ODA VP imaging ◦ Why can’t ODA come with VP image?

◦ Speed of booting .ISO over ILOM if not local

Page 47: Living with the Oracle Database Appliance

Tips

Keep It Simple! ◦ Don’t stray too far from standard ODA design goals

◦ Custom databases running off vDisks will end in tears!

Don’t mess with BIOS! ◦ Simon’s don’t-do-this-at-home node eviction test

Page 48: Living with the Oracle Database Appliance

Summary

Page 49: Living with the Oracle Database Appliance

Choose Wisely!

ODA Bare Metal or Virtualized Platform

Internal or External Backup

Double (NORMAL) or Triple (HIGH) Mirrored

Page 50: Living with the Oracle Database Appliance

Hardware

ASR is useful

Disks – replacement process needs improvement

Page 51: Living with the Oracle Database Appliance

Patching

Probably the best feature of ODA

The gift that keeps on giving!
◦ Over lifetime of an ODA you might patch/upgrade 10 or more times

Page 52: Living with the Oracle Database Appliance

Oracle Database Appliance VP

It Just Works*™ *99%!

Page 53: Living with the Oracle Database Appliance
Page 54: Living with the Oracle Database Appliance

@simon_haslam @petercmoore