TRANSCRIPT
The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website at http://www.oracle.com/investor. All information in this presentation is current as of September 2019 and Oracle undertakes no duty to update any statement in light of new information or future events.
Safe Harbor
Copyright © 2019 Oracle and/or its affiliates.
Exadata Database Machine: Maximum Availability Architecture (MAA)
Technical Presentation
April, 2020
Program Agenda
1. Exadata & Maximum Availability Architecture
2. MAA Reference Architectures
3. MAA Features in Exadata
4. MAA Exadata Lifecycle Operations
5. Summary
Copyright © 2020 Oracle and/or its affiliates.
Exadata Database Machine: Maximum Availability Architecture (MAA)
Exadata & Maximum Availability Architecture
MAA Solutions: On-Premises to Cloud
On-Premises
On-Premises Exadata and Recovery Appliance
DBCS/ExaCS/ExaCC
Autonomous Database
MAA Reference Architectures and Best Practices
MAA integrated Engineered Systems (config practices, exachk, lowest brownouts, HA QoS, data protection)
Adding MAA Config and Life Cycle Operations, Shifting Admin Ownership to Oracle with MAA SLAs
Impact of Database Downtime
• Average cost of downtime per hour: $350K
• Average cost of unplanned data center outage or disaster: $10M
• Average amount of downtime per year: 87 hours
• Percentage of companies that have experienced an unplanned data center outage in the last 24 months: 91%
Source: Gartner, Data Center Knowledge, IT Process Institute, Forrester Research
High Availability (HA) Business Challenges
• Eliminate risk of downtime and data loss
• Improve service while increasing return on investment
Exadata Addressing High Availability Challenges
Protection from Planned & Unplanned Outages
Type of Outage | High Availability Challenges | Protection using Exadata
Planned Outages | Disruptive schema changes due to application changes to meet ever-changing business requirements | Schema change impact is greatly reduced with faster changes, index and object rebuilds, and reorganizations
Planned Outages | Downtime required for lifecycle management such as periodic upgrades of firmware and software, and data migration | Downtime required for lifecycle management is mitigated using fast online upgrades, patching automation with service migration, standby-first patching, and zero downtime migration
Unplanned Outages | Data corruptions due to hardware/software faults or media issues | Data corruptions are prevented, or the potential downtime is reduced dramatically, with additional corruption prevention, detection and auto-repair
Unplanned Outages | Application brownouts due to server, instance, or storage failures, or due to planned maintenance | Application brownout reduced to sub-second with fastest instance recovery
Unplanned Outages | Disaster Recovery (DR) challenges where the DR site is not keeping up with production | DR challenges are mitigated with fastest redo apply, resulting in a low Recovery Time Objective
Exadata: Hardware + Software + Database + Availability
Decades of Database Innovation Proven at Millions of Mission-Critical Deployments
Oracle Database Innovations
• Multitenant
• In-Memory DB
• Real Application Clusters
• Active Data Guard
• Partitioning
• Advanced Compression
• Advanced Security, Label Security, DB Vault
• Real Application Testing
• Advanced Analytics, Spatial and Graph
• Management Packs for Oracle Database
Exadata DB Machine Innovations
• InfiniBand Fabric
• Columnar Flash Cache
• Storage Indexes
• Hybrid Columnar Compression (HCC), 10:1
• I/O Resource Management
• Exafusion Direct-to-Wire Protocol
• Offload SQL to Storage
• Network Resource Management
• In-Memory Fault Tolerance
• PCI Flash
• Smart Flash Cache, Log
• Redundant Optimized Hardware
Oracle Exadata Advantage
• Ideal Database Hardware: Leading edge enterprise-grade components for maximum performance and value
• Smart System Software: Database-aware algorithms vastly improve the effectiveness of ALL workloads
• Automation: Automated infrastructure integrated with Oracle Autonomous Database
• Identical On-Premises and Cloud
Oracle Exadata Cloud Offerings
Core Exadata Platform
• Exadata Cloud at Customer: In Data Center of Customer’s Choice
• Exadata Public Cloud Service: In Oracle Public Cloud Data Centers
• Database PaaS Services
• Flexible Subscription Model
• Oracle-Managed Exadata Infrastructure
• Cloud Security and Hardening
• Secure Virtual Networks
Gen 2 Exadata Cloud @ Customer
What’s New
• Gen 2 public cloud manages Gen 2 Exadata Cloud at Customer
 – Eliminates additional control plane rack in customer data center
 – Simpler, lower cost, faster time to value
• New Exadata Cloud at Customer X8 hardware
 – Faster CPUs, more cores, more storage than ExaCC X7
• Simpler connectivity to customer network
 – Adapts to customer networking standards and requirements
• Now supports Oracle Database 19c
 – Long-term support for the 12.2 family
• Ready for Autonomous Database at Customer
Diagram: Public Cloud UI and Management connected to the Customer Data Center through a Secure Tunnel
Runs the best database on the best platform in the best Cloud in your data center
Exadata: Built-in High Availability
• Redundant Database Servers
 – Active-Active highly available clustered servers
 – Hot-swappable power supplies and fans
 – Redundant power distribution units
 – Integrated HA software/firmware stack
• Redundant Network
 – Redundant 40Gb/s IB connections and switches
 – Client access using HA bonded networks
 – Integrated HA software/firmware stack
• Redundant Storage Grid
 – Data mirrored across storage servers
 – Redundant, non-blocking I/O paths
 – Integrated HA software/firmware stack
Oracle Maximum Availability Architecture (MAA)
High Availability, Disaster Recovery and Data Protection
• Applying 30+ years of lessons learned in solving the toughest HA problems around the world
• Solutions to reduce downtime for planned & unplanned outages for Enterprise customers with the most demanding workloads and requirements
• Service level oriented MAA reference architectures
• Books, white papers, blueprints
• MAA integrated Engineered Systems
• Continuous feedback into products
Diagram: Production, Copy, Database Replication
https://oracle.com/goto/maa
Oracle Maximum Availability Architecture (MAA)
Reference Architectures (Production Site, Replicated Site; Replication)
• Bronze
• Silver
• Gold
• Platinum
Deployment Choices
• Generic Systems
• Engineered Systems
• DBCS, ExaCS/ExaCC
• Autonomous DB
HA Features, Configurations & Operational Practices
• Data Protection: Flashback, RMAN + ZDLRA
• Continuous Availability: Application Continuity, Global Data Services
• Active Replication: Active Data Guard, GoldenGate
• Scale Out: RAC, Sharding, ASM
Customer Insights & Expert Recommendations
Exadata Maximum Availability Architecture
Designed and Tested to Handle All Failure Scenarios
Best MAA Database Platform | Fastest RAC Instance and Node Failure Recovery | Fastest Backup - RMAN Offload to Storage | Deep ASM Mirroring Integration | Fastest Data Guard Redo Apply | Complete Failure Testing with Lowest Brownouts | Frequently Updated Health Checks
• Within Exadata (Redundant Hardware, Redundant Software)
 – Servers, Disks, Flash, Network, Power
 – Active clusters, Disk/flash mirroring
 – Online patching, reconfiguration, expansion
• Within a Site, over LAN (Redundant Systems, Redundant Databases)
 – Local standby for HA Failover
• Across Sites, over WAN (Redundant Systems, Redundant Databases)
 – Remote standby for Disaster Recovery
• Redo-based change replication with data consistency checking
Exadata MAA Evolution
On-Premises
• Customer: Infrastructure Management; Architecture; Configuration, Tuning; Database Management; Lifecycle Operations; Application Performance
• Oracle: Blueprints; Feedback to products & features
On-Premises Exadata
• Customer: Infrastructure Management; Architecture; Database Management; Configuration, Tuning; Lifecycle Operations; Application Performance
• Oracle: Blueprints; Exadata is the best integrated MAA DB platform
Database / Exadata Cloud
• Customer: Architecture; Database Management (Tooling); Configuration, Tuning; Lifecycle Operations (Tooling); Application Performance
• Oracle: Oracle owns and manages the best integrated MAA DB platform; Cloud automation for provisioning and life cycle operations
Autonomous Database
• Customer: Choosing the SLA policy; Application Performance
• Oracle: Oracle owns and manages Infrastructure; Policy driven deployments; MAA Integrated cloud; Fully automated Self-Driving, Self-Securing, Self-Repairing Database
Oracle Enterprise Manager Cloud Control (OEM)
Configuration, Monitoring, Alerting and Management
• Exadata Database Machine
• Data Guard / Active Data Guard
• Multitenant
• Zero Data Loss Recovery Appliance (ZDLRA)
• Recovery Manager (RMAN)
• Real Application Clusters (RAC)
• Edition Based Redefinition (EBR)
• Oracle Sharding
• Oracle GoldenGate (OGG) – Monitoring and Alerting Only
Exadata Database Machine: Maximum Availability Architecture (MAA)
MAA Reference Architectures
Reference Architectures – Level Set
• Blueprints developed and certified by Oracle
• Validated by 10,000s of Oracle Customers
• Capabilities carry forward as you progress from one tier to the next
• Achieving stated service levels requires:
 – Utilization of prescribed features and capabilities
 – Utilization of prescribed configuration and operational best practices
 – Due diligence during pre-production testing
 – Due diligence on all life cycle operations
 – Maintaining recommended patch levels and versions
Oracle Maximum Availability Architecture (MAA) Solution Options
BRONZE
Dev, Test, Prod - Single Instance or Multitenant Database with Backups
• Single Instance with Clusterware Restart
• Advanced backup/restore with RMAN
• Optional ZDLRA with incremental forever and near zero RPO
• Storage redundancy and validation with ASM
• Multitenant Database/Resource Management with PDB features
• Online Maintenance
• Some corruption protection
• Flashback technologies
Diagram: Single Instance Database with Local Backup in the Primary Availability Domain; Replicated Backups in the Secondary Availability Domain
Outage Matrix (RTO / RPO*)
Unplanned Outage
Recoverable node or instance failure | Minutes**
Disasters: corruptions and site failures | Hours to days. RPO since last backup or near zero with ZDLRA
Planned Maintenance
Software/hardware updates | Minutes**
Major database upgrade | Minutes to hour
* RPO=0 unless explicitly specified
** Exadata systems have RAC, but a Bronze Exadata configuration with a Single Instance database running with Oracle Clusterware has the highest consolidation density to reduce costs
Zero Data Loss Recovery Appliance in Your Data Center
Protected Databases: Protects all DBs in Data Center
• Petabytes of data
• Oracle 10.2-18c, any platform
• No expensive DB backup agents
Delta Push
• Send only incremental changes and no more full backups
• Real-time transactions copied over for continuous data protection
Recovery Appliance: Delta Store
• Stores validated, compressed data on disk
• Fast restores to any point-in-time
• Built on Exadata scaling and resilience
• Enterprise Manager end-to-end control
Replicates to Remote Recovery Appliance
Offloads Tape Backup
Unified Management
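The "incremental forever" idea on this slide can be illustrated with a toy model: one full backup is taken once, only changed blocks are sent afterwards, and a full image for any restore point is rebuilt by layering the deltas over the base. This is a minimal sketch of the concept only; the function name and the dictionary-based block store are assumptions made for illustration and do not describe how the Recovery Appliance's delta store is implemented.

```python
def virtual_full(base, increments, upto):
    """Rebuild a full image as of sequence `upto` from one base plus deltas.

    base       -- dict mapping block number to contents (the single full backup)
    increments -- list of (sequence, changed_blocks) pairs, ordered by sequence
    """
    image = dict(base)
    for seq, changed in increments:
        if seq > upto:
            break
        image.update(changed)  # the most recent version of a block wins
    return image


base = {1: "A0", 2: "B0", 3: "C0"}   # hypothetical full backup
increments = [
    (1, {2: "B1"}),                  # only changed blocks are sent
    (2, {1: "A2", 3: "C2"}),
    (3, {2: "B3"}),
]
restore_point_2 = virtual_full(base, increments, upto=2)
```

Restoring to an earlier or later point only changes the `upto` argument; no further full backup is needed.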
Backing up Exadata
• Storage Expansion Rack and X8-2 Extended (XT): Fastest Backup and Restore; ILM Historical Archive; Second Disk Group; Expansion of DATA
• Recovery Appliance: Delta Push & Backup Validation; Incremental Forever; Zero Data Loss Recoverability
• Database Backup Cloud Service: Offsite Storage; Low Cost
• Tape library: Offsite Backups; Vaulting
Connectivity shown in the diagram: InfiniBand Network; 10GigE or 25GigE; Public Network; Media Server over IB, 10GigE, or 25GigE; Fiber Channel SAN
SILVER
Prod/Departmental
Bronze +
• Real Application Clustering (RAC)
• Application Continuity
Diagram: RAC Database with Local Backup in the Primary Availability Domain; Replicated Backups in the Secondary Availability Domain
Outage Matrix (RTO/RPO*)
Unplanned Outage
Recoverable node or instance failure | Zero**
Disasters: corruptions and site failures | Hours to days. RPO since last backup or near zero with ZDLRA
Planned Maintenance
Software/hardware updates | Zero**
Major database upgrade | Minutes to hour
* RPO=0 unless explicitly specified
** To achieve zero, requires applying application checklist
Checklist found in MAA OTN https://www.oracle.com/technetwork/database/options/clustering/applicationcontinuity/adb-continuousavailability-5169724.pdf
Oracle Real Application Clusters (Oracle RAC)
Node Failure, Instance Failure, Rolling Maintenance
• Utilizes two or more instances of an Oracle Database concurrently
• Very Scalable
 – All instances active; Add capacity online; Ideal for database consolidation
• Highly Available
 – Auto-failover of services to an already running instance; Outage is transparent to user, in-flight transactions succeed; Zero downtime rolling maintenance
Diagram: Application Tier, Database Services, Database Tier, Primary Database
Application does not see errors during outages
Transparent Application Continuity (TAC)
• Uses Application Continuity and Oracle Real Application Clusters
• Transparently tracks and records session information in case there is a failure
• Built inside of the database, so it works without any application changes
• Rebuilds session state and replays in-flight transactions upon unplanned failure
• Planned maintenance can be handled by TAC to drain sessions from one or more nodes
• Adapts as applications change: protected for the future
Diagram: Request; Errors/Timeouts hidden
Transparent Application Continuity Explained
Normal Operation
• Client marks requests: explicit and discovered.
• Server tracks session state, decides which calls to replay, disables side effects.
• Directed, client holds original calls, their inputs, and validation data.
Failover Phase 1: Reconnect
• Checks replay is enabled
• Verifies timeliness
• Creates a new connection
• Checks target database is legal for replay
• Uses Transaction Guard to guarantee commit outcome
Failover Phase 2: Replay
• Restores and verifies the session state
• Replays held calls, restores mutables automatically
• Ensures results, states, messages match original.
• On success, returns control to the application
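The three phases above can be mimicked with a small simulation: during normal operation the client holds each call with its original result, and on failure it reconnects, replays the held calls, checks that the results match, and only then continues. This is a conceptual sketch, not Oracle driver code; `ToySession`, `ConnectionLost`, `run_request` and `make_connector` are hypothetical names, and the Transaction Guard commit-outcome check from Phase 1 is deliberately left out.

```python
class ConnectionLost(Exception):
    """Raised by the toy session when the simulated instance fails."""


class ToySession:
    """Hypothetical stand-in for a database session (not an Oracle API)."""

    def __init__(self, fail_on_call=None):
        self.state = {}
        self.calls_seen = 0
        self.fail_on_call = fail_on_call

    def execute(self, key, value):
        self.calls_seen += 1
        if self.fail_on_call == self.calls_seen:
            raise ConnectionLost("instance failed mid-request")
        self.state[key] = value
        return len(self.state)  # deterministic result, used for validation


def run_request(calls, connect):
    """Run calls as one request; on failure, reconnect and replay held calls."""
    session = connect()
    held = []  # normal operation: hold each call and its original result
    for call in calls:
        try:
            result = session.execute(*call)
        except ConnectionLost:
            session = connect()                # phase 1: reconnect
            for old_call, old_result in held:  # phase 2: replay and validate
                if session.execute(*old_call) != old_result:
                    raise RuntimeError("replay diverged; surface the error")
            result = session.execute(*call)    # re-issue the in-flight call
        held.append((call, result))
    return session.state


def make_connector():
    """First connection fails on its second call; the next one is healthy."""
    sessions = iter([ToySession(fail_on_call=2), ToySession()])
    return lambda: next(sessions)


final_state = run_request([("a", 1), ("b", 2), ("c", 3)], make_connector())
```

The caller of `run_request` never sees `ConnectionLost`, which is the behavior the slide describes as errors being hidden from the application.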
Checklist for Achieving Zero Application Downtime
1. Use Oracle Clusterware Service (never use default service)
2. Use Recommended Connection String
3. Configure FAN for Connection Pool
4. Drain your service
5. Use Application Continuity or Transparent Application Continuity
References
1) MAA Whitepaper: Application Checklist for Continuous Service for MAA Solutions
2) Using RHPhelper to Minimize Downtime During Planned Maintenance on Exadata (MOS 2385790.1)
3) Fleet Patch and Provisioning incorporates MAA practices
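Item 2 of the checklist refers to a connection string with retry and timeout settings. The sketch below only shows the general shape of such an Oracle Net connect descriptor; the helper name `build_connect_descriptor`, the host and service names, and every numeric value are placeholders chosen for illustration, not recommendations. The values actually recommended are in the MAA checklist whitepaper listed above.

```python
def build_connect_descriptor(host, service, port, connect_timeout,
                             retry_count, retry_delay,
                             transport_connect_timeout):
    """Assemble an Oracle Net connect descriptor with retry settings."""
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})"
        f"(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})"
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"
        "(ADDRESS_LIST=(LOAD_BALANCE=on)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        f"(CONNECT_DATA=(SERVICE_NAME={service})))"
    )


# Placeholder values only; "my-scan" and "my_app_service" are hypothetical.
descriptor = build_connect_descriptor(
    host="my-scan", service="my_app_service", port=1521,
    connect_timeout=90, retry_count=50, retry_delay=3,
    transport_connect_timeout=3,
)
```

Note that the service name points at an application service created through Clusterware, in line with item 1, rather than at the default database service.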
GOLD
Mission Critical
Silver +
• Active Data Guard
• Comprehensive Data Protection
MAA Architecture:
• At least one standby required across AD or region.
• Primary in one data center (or AD) replicated to a Standby in another data center
• Active Data Guard Fast-Start Failover (FSFO)
• Local backups on both primary and standby
Diagram: Primary Region (AD1, AD2) with Primary, Local Standby and local backup; Secondary Region with Remote Standby and local backup; DG FSFO
Outage Matrix (RTO/RPO*)
Unplanned Outage
Recoverable node or instance failure | Seconds
Disasters: corruptions and site failures | Seconds. RPO zero or seconds
Planned Maintenance
Software/hardware updates | Zero
Major database upgrade | Seconds
* RPO=0 unless explicitly specified
** To achieve zero, requires applying application checklist
Storage Remote Mirroring Architecture
Generic - Must Transmit Writes to All Files …. INCLUDING CORRUPTED BLOCKS OR BAD DATA
Diagram: Oracle Instance (in memory); Primary Database; SYNC or ASYNC block replication; Mirrored Volumes
• Zero Oracle validation
• 7x network volume
• 27x network I/O
Data Guard Addresses Shortcomings of Storage Replication
Inadequate isolation, zero application-level validation
“…when something happens in the I/O stack and a database write is malformed Symmetrix A happily replicates the faulty data to site B and the corruption goes undetected”
EMC BLOG with Integrity
Oracle Data Protection
Gold – Comprehensive Data Protection
Type | Capability | Physical Block Corruption | Logical Block Corruption
Manual | Dbverify, Analyze | Physical block checks | Logical checks for intra-block and inter-object consistency
Manual | RMAN, ASM | Physical block checks | Intra-block logical checks
Runtime | Active Data Guard | Continuous physical block checking at standby; Strong isolation to prevent single point of failure; Automatic repair of physical corruptions; Automatic database failover (option for lost writes) | Detect lost write corruption, auto shutdown and failover; Intra-block logical checks at standby
Runtime | Database | In-memory block and redo checksum | In-memory intra-block checks, shadow lost write protection
Runtime | ASM | Automatic corruption detection and repair using extent pairs |
Runtime | Exadata | HARD checks on write, automatic disk scrub and repair | HARD checks on write
Active Data Guard Overview
Primary: Open Read-Write
Standby: Open Read-Only
• Zero Data Loss at any Distance
• Automatic Block Repair
• Offload read only or read mostly workloads to the standby database
• Synchronous zero data loss replication
• Database rolling upgrade to reduce downtime for planned maintenance
• Automatic failover for High Availability
• DML Redirection
• Multi-instance Redo Apply for RAC (In Memory supported)
PLATINUM
Extreme Critical
Gold +
• GoldenGate Active/Active Replication
• Optional Sharding & Edition-Based Redefinition
MAA Architecture:
• Each GoldenGate “primary” replica protected by Exadata, RAC and Active Data Guard
• Primary in one data center (or AD) replicated to another Primary in remote data center (or AD)
• Oracle GG & Edition-Based Redefinition for zero downtime application upgrade
• Sharding for scalability and fault isolation
• Local backups on both sites
• Achieve zero downtime through custom failover to GG replica
Diagram: Primary Region (AD1, AD2) and Secondary Region (AD1, AD2), each with a Primary, a Standby and a local backup, linked by GG Replication
Outage Matrix (RTO/RPO*)
Unplanned Outage
Recoverable node or instance failure | Zero**
Disasters including corruptions and site failures | Zero***
Planned Maintenance
Most common software/hardware updates | Zero**
Major database upgrade, application upgrade | Zero***
* RPO=0 unless explicitly specified
** To achieve zero, requires applying application checklist
*** Application failover is custom, to fail over to the GG replica
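The four outage matrices (Bronze, Silver, Gold, Platinum) can be read as a lookup table for choosing a tier. The sketch below encodes the worst-case recovery time stated for each tier on a coarse ordinal scale and returns the lowest tier that meets a set of requirements. The encoding is an assumption made for illustration: ranges such as "Hours to days" are reduced to their upper bound, RPO is not modeled, and the footnoted conditions (for example, that "Zero" requires applying the application checklist) are ignored.

```python
# Ordinal scale, best to worst; ranges are reduced to their upper bound.
SCALE = ["zero", "seconds", "minutes", "hour", "days"]

# Worst-case recovery time per tier, taken from the outage matrices.
TIERS = {
    "Bronze": {"node_failure": "minutes", "disaster": "days",
               "updates": "minutes", "upgrade": "hour"},
    "Silver": {"node_failure": "zero", "disaster": "days",
               "updates": "zero", "upgrade": "hour"},
    "Gold": {"node_failure": "seconds", "disaster": "seconds",
             "updates": "zero", "upgrade": "seconds"},
    "Platinum": {"node_failure": "zero", "disaster": "zero",
                 "updates": "zero", "upgrade": "zero"},
}


def lowest_tier(requirements):
    """Return the first tier (Bronze upward) meeting every requirement."""
    for tier, matrix in TIERS.items():
        if all(SCALE.index(matrix[outage]) <= SCALE.index(limit)
               for outage, limit in requirements.items()):
            return tier
    return None


choice = lowest_tier({"disaster": "minutes"})
```

A requirement of zero impact for both node failures and disasters resolves to Platinum, because the Gold matrix lists "Seconds" for recoverable node or instance failure.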
GoldenGate plus 2 Optional Approaches to Further Protect Your Applications
• Required: Use Oracle GoldenGate
• Optional: Use Edition-Based Redefinition
• Alternative: Use Oracle Sharding
Oracle GoldenGate Microservices Architecture
Source: Oracle & Non-Oracle Database(s)
Target: Oracle & Non-Oracle Database(s)
• Capture: committed transactions are captured (and can be filtered) as they occur by reading the transaction logs.
• Trail: stages and queues data for routing.
• Distribution Server/Receiver: distributes data for routing to target(s).
• Route: data is compressed, encrypted for routing to target(s).
• Delivery: applies data with transaction integrity.
Diagram: Capture, Trail Files, Distribution Service, Receiver Service, Trail Files, Delivery; bi-directional, over TCP/IP across LAN / WAN / Internet
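The capture, trail and delivery stages can be pictured with a small in-memory pipeline: capture reads a transaction log and emits only committed transactions, a queue plays the role of the trail, and delivery applies each transaction as a unit. This is an illustrative sketch under simplifying assumptions; the record layout and the function names are invented for the example, and compression, encryption, filtering and the distribution/receiver services are omitted.

```python
from collections import deque


def capture(transaction_log):
    """Yield committed transactions in commit order; drop rolled-back ones."""
    pending = {}
    for record in transaction_log:
        txid, op = record["tx"], record["op"]
        if op == "commit":
            yield {"tx": txid, "changes": pending.pop(txid, [])}
        elif op == "rollback":
            pending.pop(txid, None)
        else:
            change = (op, record["key"], record.get("value"))
            pending.setdefault(txid, []).append(change)


def deliver(trail, target):
    """Apply queued transactions to the target, one whole transaction at a time."""
    while trail:
        txn = trail.popleft()
        for op, key, value in txn["changes"]:
            if op == "delete":
                target.pop(key, None)
            else:  # insert or update
                target[key] = value
    return target


log = [
    {"tx": 1, "op": "insert", "key": "a", "value": 1},
    {"tx": 2, "op": "insert", "key": "b", "value": 9},
    {"tx": 2, "op": "rollback"},
    {"tx": 1, "op": "commit"},
    {"tx": 3, "op": "delete", "key": "a"},  # never committed
]
trail = deque(capture(log))       # the trail stages and queues captured data
replica = deliver(trail, {})
```

Only transaction 1 reaches the replica: transaction 2 was rolled back and transaction 3 had not committed when the log was read.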
Edition-Based Redefinition
Online Application Upgrade
• Enables application upgrades to be performed online
• Code changes installed in the privacy of a new edition
• Data changes are made safely by writing only to new columns or new tables not seen by the old edition
• An editioning view exposes a different projection of a table into each edition to allow each to see just its own columns
• A cross-edition trigger propagates data changes made by the old edition into the new edition’s columns, or (in hot-rollover) vice-versa
Alternate Platinum Option: Sharding
Highly scalable, fault tolerant architecture for Internet Applications
• Custom Built Application optimized to use shard keys
• Horizontal partitioning of data across independent databases (shards)
 – Each shard holds a subset of the data
 – Can be single-node or RAC or PDB
 – Replicated for high availability
• Shared-nothing architecture:
 – Shards don’t share any hardware (CPU, memory, disk), or software (Clusterware)
Diagram: A single logical DB sharded into N physical Databases (Table1 Shard1 on Server1, Table1 Shard2 on Server2, Table1 Shard3 on Server3)
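Horizontal partitioning by shard key can be sketched as a routing function that maps each key to exactly one shard. The example uses a plain hash-modulo scheme, which is a simplification swapped in for illustration and should not be read as the partitioning method Oracle Sharding uses internally; the function names and the customer keys are likewise hypothetical.

```python
import hashlib


def shard_for(shard_key, shard_count):
    """Map a shard key to a shard number in the range [0, shard_count)."""
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(rows, shard_count):
    """Split (shard_key, payload) rows across independent per-shard stores."""
    shards = [dict() for _ in range(shard_count)]
    for key, payload in rows:
        shards[shard_for(key, shard_count)][key] = payload
    return shards


rows = [(f"customer-{n}", {"orders": n}) for n in range(100)]
shards = partition(rows, shard_count=3)
```

Because the application supplies the shard key with each request, a lookup touches a single shard and the shards never need to coordinate with each other.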
Sharding Configuration Options
Use Sharding with Active Data Guard, RAC or Oracle GoldenGate
• Active Data Guard with Fast-Start Failover
• GoldenGate ‘chunk-level’ active-active replication with automatic conflict detection/resolution
• Optionally – complement replication with Oracle RAC for server HA
https://www.oracle.com/database/technologies/high-availability/sharding.html
Maximum Availability Architecture (MAA)
MAA Features in Exadata
Exadata: Maximum Availability Architecture Features
Data Protection
Quality of Service
Management
Performance
Brownout Reduction
Code & Configuration
Exadata: Data Protection
Corruption Detection, Prevention & Repair
• If an application update in the database encounters corruption
 – Database reads from the ASM mirror
 – Repairs the corruption using the good copy
 – This repair happens without impacting other database processes and the application
• When a network packet in the I/O path between DB server and storage node is corrupted
 – Storage cell prevents the write
 – ASM retries by re-sending the packet
 – Application never encounters corruptions
• When a drive is reported as failed, but has not physically failed
 – Automatically power cycles the drive to avoid a false positive drive failure
 – Works on both High Capacity & Extreme Flash Cells
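The first scenario, reading from the mirror and repairing the bad copy, follows a general checksum-and-mirror pattern that can be sketched as below. The `MirroredStore` class and its CRC32 checksum are assumptions for illustration only; they are not the ASM on-disk format or the actual block checks performed by the database and storage cells.

```python
import zlib


class MirroredStore:
    """Toy two-way mirrored block store with per-block checksums."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def write(self, block_id, data):
        record = (data, zlib.crc32(data))
        self.primary[block_id] = record
        self.mirror[block_id] = record

    def read(self, block_id):
        data, expected = self.primary[block_id]
        if zlib.crc32(data) == expected:
            return data
        # Primary copy is corrupt: fall back to the mirror copy.
        good, good_sum = self.mirror[block_id]
        if zlib.crc32(good) != good_sum:
            raise IOError(f"both copies of block {block_id} are corrupt")
        self.primary[block_id] = (good, good_sum)  # repair in place
        return good


store = MirroredStore()
store.write(7, b"account balance = 100")

# Simulate a media fault by flipping one bit in the primary copy.
data, checksum = store.primary[7]
store.primary[7] = (bytes([data[0] ^ 1]) + data[1:], checksum)

recovered = store.read(7)
```

The reader gets the good data back from the same `read` call that detected the problem, and the primary copy is repaired as a side effect.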
Exadata: Data Protection
Storage Failures
• When a drive is reported as failed, but has not physically failed
 – Automatically power cycles the drive to avoid a false positive drive failure
• Works on both High Capacity Disks & Extreme Flash Cells
• When a storage failure occurs
 – Performs database-aware priority restores
 – Control files, log files, SP files, TDE key stores, OCR, Wallets and then database files (MOS 1968607.1)
• With 12.2 and higher:
 – Redundancy restore after storage loss takes much less time
 – New REBUILD phase done first, which restores REDUNDANCY, followed by restoring BALANCE
 – Exadata flash cache leveraged for rebalance reads, improving redundancy restoration performance by up to 30%
Exadata: Data Protection
Efficient Rebalance with Service Level Protection
• Intelligent and flexible rebalance power setting
 – Testing in MAA labs to find best balance between redundancy restoration timing and service level protection.
 – MAA best practice default of 4 (total across clusters) set at deployment time
 – MAA best practice max of 64 (total across clusters) available as needed
 – MOS note 757552.1 available with more information and guidance
• Performs database-aware priority restores
 – Control files, log files, SP files, TDE key stores, OCR, Wallets and then database files (MOS 1968607.1)
• 12.2+ ASM rebalance restores redundancy first, drastically reducing secondary failure exposure window
• 12.2+ Exadata leverages flash cache for rebalance reads, improving redundancy restoration performance by up to 30%
Exadata: Data Protection
Exadata ASM configuration best practices
• Eighth and Quarter Rack ASM High Redundancy Voting Device / Quorum Disk Best Practice Check
 – We already store vote devices on each of the three storage cells
 – But now we also check for an iSCSI device based quorum disk that can house an additional vote device on each database node
• ASM Power Limit Best Practice Check To Ensure a Rebalance Will Run
Do-Not-Service LED (X7 and higher)
• Gives the data center technician an easy-to-read visual indicator that prevents servicing that might cause an outage.
• Leverages the ASMDeactivationOutcome cell attribute, which is storage partner aware
M.2 Fast Failure Protection and Online Replacement (X7 and higher)
• Two M.2 drives house operating system and cell software. Storage server hard disks and flash drives contain application data only.
• M.2 drives protected with Intel RSTe RAID
• Chassis can be opened and M.2 drive replaced online while the storage server continues to service the application
USB Drive & DBFS_DG by default
Online Flash Replacement (X7 and higher)
“Hot-Plug”
• Chassis can be opened and the flash drives can be replaced online while the storage server continues to service the application.
• For a failed drive, replace when ready
• For an online drive, just tell us so we can properly prepare it. Example:
CellCLI> alter physicaldisk FLASH_2_2 drop for replacement;
Exadata: Quality of Service
For Optimal Performance
• Cell Side IO Latency Capping (Hard Disk & Flash)
 – When excessive IO is performed to a cell over PCI
 – The read IO is redirected to the partner cell
 – The write IO is canceled and temporarily written to healthy flash on the same cell
• Cell Side Disk Confinement
 – When a disk goes bad and is taken offline
 – Diagnostic is automatically run on the disk to determine health
 – If healthy, disk is returned to ONLINE status and re-synched
 – If unhealthy, health factor drop is performed, rebalance is performed and blue LED is lit after completion
Chart: LGWR Delay after Hung IO (seconds): Exadata 1, Traditional Storage 30
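The read side of latency capping, redirecting a slow read to the partner cell, can be sketched as a loop over the cells that hold a mirror copy. This is a simplified simulation: each "cell" is a hypothetical callable that reports its own latency, whereas the real feature cancels an in-flight IO once the cap is exceeded. All names and numbers are assumptions made for the example.

```python
def read_with_latency_cap(cells, block_id, cap_ms):
    """Return (cell_name, data) from the first copy served within the cap."""
    for name, serve in cells:
        latency_ms, data = serve(block_id)
        if latency_ms <= cap_ms:
            return name, data
        # Over the cap: give up on this cell and try its partner.
    raise TimeoutError(f"no cell served block {block_id} within {cap_ms} ms")


def make_cell(latency_ms, blocks):
    """Build a simulated cell that always answers with a fixed latency."""
    return lambda block_id: (latency_ms, blocks[block_id])


blocks = {42: b"row data"}
cells = [
    ("cell01", make_cell(900, blocks)),  # sick cell, far above the cap
    ("cell02", make_cell(2, blocks)),    # healthy partner cell
]
served_by, data = read_with_latency_cap(cells, 42, cap_ms=100)
```

The caller sees one successful read from `cell02` rather than a 900 ms stall, which is the effect the chart on this slide quantifies for LGWR.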
Exadata: Quality of Service
Smart Storage with IO Resource Manager
• Each IO is tagged with who issued the IO, purpose & priority
• Enables mixed workloads, consolidation of many databases with multiple tiers of performance
Example IO Tasks | Action Taken
Table scan from Critical Data Warehouse | High-priority query. IORM prioritizes against other scans on both flash and disk!
Table scan read from an Ad-Hoc Query | Low-priority, resource-intensive query. Stage to flash, only if there’s room. De-prioritize disk or flash I/O.
DBWR write - no threat of “free buffer wait” | Not urgent – plenty of free buffers. IORM de-prioritizes this I/O
DBWR write to resolve “free buffer wait” | Urgent – users are blocked. IORM prioritizes this I/O
LGWR redo write | High-priority I/O. Accelerated via Exadata Flash Log!
Buffer Cache read for OLTP transaction, PDB | Medium-priority I/O. Stage to flash. Prioritize against other user I/Os, based on resource plan.
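Tagging each IO with its issuer and purpose makes priority scheduling straightforward. The sketch below classifies tags into the high, medium and low classes used in the table and drains a priority queue in that order, breaking ties by arrival. The tag format, the `classify` rules and the numeric ranks are assumptions for illustration; real IORM behavior is driven by resource plans and is considerably richer than a single queue.

```python
import heapq
from itertools import count

RANK = {"high": 0, "medium": 1, "low": 2}


def classify(tag):
    """Map an (issuer, purpose) tag to a priority class, following the table."""
    issuer, purpose = tag
    if issuer == "LGWR":
        return "high"
    if issuer == "DBWR":
        return "high" if purpose == "free buffer wait" else "low"
    if purpose == "critical scan":
        return "high"
    if purpose == "oltp read":
        return "medium"
    return "low"  # for example, ad-hoc table scans


def schedule(tags):
    """Return tags in service order: by priority class, then by arrival."""
    arrival = count()
    heap = [(RANK[classify(tag)], next(arrival), tag) for tag in tags]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


incoming = [
    ("query", "ad-hoc scan"),
    ("DBWR", "background write"),
    ("session", "oltp read"),
    ("LGWR", "redo write"),
    ("DBWR", "free buffer wait"),
]
service_order = schedule(incoming)
```

The redo write and the urgent DBWR write are served first even though they arrived last, while the ad-hoc scan waits behind the OLTP read.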
Smart Flash Replacement
• After flash failure, a “health factor” status is set on the set of hard drives backing the failed flash.
• Reads are satisfied from healthy partner cell instead of the cell with a reduced amount of flash
• Health factor status clears after flash replacement *and* cache warmup
• This feature reduces the application service level impact after flash failure
Exadata: Quality of Service
SLAs Maintained During Planned Maintenance or When Storage is Compromised
• Exadata flash cache state preserved during ASM rebalance operations. One practical example is the resync that occurs during cell software rolling updates.
• Intelligent routing of IO requests to cell providing the best service after flash failure and repair
• Applicable to both unplanned outages and planned maintenance
Performance is Time. Time is Money.
Database Tier IO Cancel
Oracle Grid Infrastructure & Database 19c
Possible causes: Slow IO? Hung IO? Sick disk? Undiscovered hardware / software issue?
• Storage Tier: Cell IO Latency Capping; IO Hang detection / repair; Disk confinement
• Database Tier: Database Tier IO Latency Capping
IOs are Pumping
Oracle Grid Infrastructure 19c Rebalance For High Redundancy Diskgroups
63
Problem: Rebalance runs out of space after disk failure (ORA-15041)Solution for 18c and lower
Run exachk which reports on compliance to our MAA best practice
Solution for 19c with high redundancy diskgroups
Smart rebalance - no need for free space!If there is not enough space to rebalance at the time of failure, offline the diskUpon replacement, efficiently repopulate it from partner disks automatically
15% free with a normal or high redundancy diskgroup having < 5 Exadata cells and GI versions 12.2 and 18c
0% free with 19c high redundancy diskgroup.
9% free with a normal or high redundancy diskgroup having 5 or more Exadata cellsand GI versions 12.2 and 18c
Summary: Database Tier IO Cancel
Feature Oracle Has Provided: Protection from uncommon storage tier stalls/hangs
Best Practices You Can Implement: Nothing! Completely transparent.
Service Level Impact Expectations: Stable service level achieved through IO redirection on stalls/hangs
Exadata: Maximum Availability Architecture Features
Data Protection
Quality of Service
Management
Performance
Brownout Reduction
Code & Configuration
Enterprise Manager 13c – Improved Plug-in
Exadata: Management
• Database consolidation workbench
• Planning, Migration, Validation
• Estimate IO bandwidth savings with consolidation
• Automated Patching
• Lifecycle management of Exadata in virtualized environment
• Create/Delete RAC databases & VMs
• Scale up/down cluster and VMs
• Exachk integration
Notification & Replacement Process for any Faults
Exadata: Management
Fault Management
Components break: Fully automated notification and replacement process through ASR (Auto Service Request)
Components get sick: Exadata uniquely qualified to handle sick components with full stack integration. Exadata provides system/service level high availability.
Intelligent hardware/software integration helps prevent human error: Blue light indicating disk replacement can be performed. Cell shutdown prevention and notification when redundancy would be compromised. X7 Do Not Service LED.
Cell shutdown causing application outage: Smart handshake with database tier and proactive redundancy checks during cell (or cellsrv) shutdown to prevent application outage.
Exadata: Management
• Shameless plug for the What’s New section of the “Exadata Database Machine System Overview” documentation
• In Oracle Exadata Storage Software release 12.2.1.1.0, GetExaWatcherResults.sh generates HTML pages that contain charts for IO, CPU utilization, cell server statistics, and alert history. The IO and CPU utilization charts use data from iostat, while the cell server statistics use data from cellsrvstat. Alert history is retrieved for the specified timeframe.
• Example on the next slide…
ExaWatcher Graphing Support
Exadata: Management
• EXAchk provides configuration specific, up-to-date health check across the entire stack
• Covers Exadata, Database, Grid Infrastructure, ASM critical issues
• Provides MAA scorecard with MAA configuration gaps and guidance to mitigate
• Automated periodic scheduled runs with email notifications
• Continuous evolution of configuration checks
• EXAchk helps with saving a lot of time and money due to proactive health verification which dramatically reduces downtime
• Currently has over 1000 checks per target
• Development recommends that the latest EXAchk be executed with the following frequency:
  • Monthly
  • Week before any planned maintenance activity
  • Day before any planned maintenance activity
  • Immediately after completion of planned maintenance activity or an outage or incident
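The recommended cadence around a maintenance window can be turned into concrete dates. This is a minimal sketch, not an EXAchk feature: the function `exachk_dates_for_maintenance` and its parameters are hypothetical, and the monthly run is assumed to be scheduled separately.

```python
from datetime import date, timedelta
from typing import List, Tuple


def exachk_dates_for_maintenance(
    maintenance_day: date, completion_day: date
) -> List[Tuple[str, date]]:
    """Return (reason, date) pairs for the runs recommended around maintenance.

    Hypothetical helper: "immediately after completion" is modelled as the
    completion day itself.
    """
    return [
        ("week before planned maintenance", maintenance_day - timedelta(days=7)),
        ("day before planned maintenance", maintenance_day - timedelta(days=1)),
        ("immediately after completion", completion_day),
    ]


if __name__ == "__main__":
    for reason, day in exachk_dates_for_maintenance(date(2020, 4, 20), date(2020, 4, 20)):
        print(day.isoformat(), reason)
```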
Health Check using EXAchk Utility
Note: Automated Exachk Healthcheck MOS 1070954.1
EXAchk: Sample Reports
MAA Score Card: Critical Issues, incompatible features usage
Assessment Report: Health Score, Summary, Findings
Findings & Recommendations: How to solve the problem?
Diagnostic Pack
• A compressed archive containing logs, traces, relevant diagnostic information about the storage server and an index.html
• Contents customized to the particular incident
• Outgoing email for the alert contains the diag pack, as well as a link to it on the server.
Diagnostic Pack Example
Sick Component Handling
One Stop Shopping for Performance Problems
Exadata AWR Support
Exadata AWR Support
• Configuration differences detected across storage servers
  • Exadata Storage Server Model
  • Exadata Storage Version (group by package type/package version)
  • Exadata Storage Information (group by all columns - flash cache size, flash log size, # hard disks, # flash, # griddisks)
  • Exadata Griddisks (group by # griddisks, griddisk size and disk type)
  • Exadata Celldisks (group by disk type, celldisk size, # celldisks)
• Statistical differences detected compared to data sheet limits
  • Max IOPS/throughput for OS statistics are colored dark red
  • Outliers for OS and Cell Server statistics are colored (pinkish-red for high, yellow for low)
Unique Configuration and Outlier Detection
Exadata AWR Support
Outlier Detection Example from a Real (Big) Customer
Exadata: Maximum Availability Architecture Features
Data Protection
Quality of Service
Management
Performance
Brownout Reduction
Code & Configuration
Exadata Gets New Technology at Best Cost Due to High Volume Server Economics
Server Centric Enables Leading-Edge Architecture
Get New and Fastest Processors First
Scale-out Storage Servers Enable Intelligence in Storage
Unified Ultra-fast InfiniBand
Ultra-fast NVMe PCIe Flash
Get Fastest NVMe PCIe Flash First
PCI Flash
Tier PCIe Flash & Huge Disks
Get Bigger Disk Drives First
Scale-Out with Fastest CPUs
Use Modern Ultra-Fast Network
Exadata: Performance
Exadata Specific H/W & S/W Features
• Exadata Smart Scan and Reverse Offload
• Exadata Smart Logging
• Exadata Smart Persistent Write Back Flash Cache
• Exadata Persistent Memory
• Exadata Active/Active IB network
MAA Features
• Fastest Object Reorganization
• Fastest Instance Recovery
• Fastest Flashback Capability
• Fastest Backups to Exadata, Recovery Appliance
• Fastest Active Data Guard & Standby Redo Apply
• Fastest GoldenGate Performance
Exadata Hardware + Exadata Software + Oracle Database provides the ultimate performance!
Exadata Smart System Software
Fastest Analytics
• Unique Smart Scan automatically offloads data intensive SQL operations to storage
• Unique Smart Flash Cache and Storage Index automatically accelerate database I/O
• Unique automatic conversion of data to fast In-Memory Columnar format in flash
Fastest OLTP
• Fastest OLTP I/O with scale-out storage, RDMA, and NVMe flash
• Fastest scale-out with unique RDMA algorithms for inter-node cluster coordination
• Fully redundant and fastest recovery from failed or sick components
Best Consolidation
• Uniquely prioritizes latency sensitive or important workloads through full stack
• Uniquely isolates workloads from multiple tenants through full stack
Exadata X8M (changes from X8 in red)
Scale-Out 2 or 8 Socket Database Servers
• Latest 24 core Intel Cascade Lake
100Gb RDMA over Converged Ethernet (RoCE) Internal Fabric
Scale-Out Intelligent 2-Socket Storage Servers
• 1.5 TB Persistent Memory per storage server
• Three tiers of storage: PMEM, NVMe, HDD
Enhanced consolidation using Linux KVM
KVM
Full set of Exadata KVM best practices will be available here: https://www.oracle.com/database/technologies/high-availability/exadata-maa-best-practices.html
Some MAA notables:
• The number of guests supported on KVM is 12 (8 on Xen)
• Prior generations of Exadata can be connected via Data Guard or Golden Gate
• Standard backup procedures apply
• The KVM host can optionally snapshot VM disk images and store them externally
• Update core Exadata infrastructure with patchmgr
• Update Grid Infrastructure and Database ORACLE_HOMEs with oedacli
• Perform lifecycle operations with vm_maker and oedacli
MAA Characteristics
Exadata Data Access Tiers
MAA Characteristics
[Diagram, not drawn to scale: data access tiers spanning the Database Node and the Storage Cell. Tier labels: Buffer Cache, PMEM, FLASH, DISK. Temperature labels: Sizzling, Hot, Warm, Cold. Arrow labels: Database Read, Buffer Evicted. One component is marked failed (X).]
• Primary copy of data placed in PMEM cache on a read miss
• Secondary copy of data placed in flash cache on buffer eviction
• If a PMEM module fails in Writethrough mode, no redundancy restoration is required
• If a PMEM module fails in Writeback mode, a resilver operation is run to restore redundancy
• Low latency flash reads will repopulate super low latency PMEM
2 Active-Active ports in every RDMA Network Fabric Adapter
2 RDMA Network Fabric Switches in every Exadata single rack
22 Ports per switch used for internal cluster network, cabled ensuring no single point of failure exists
RDMA over Converged Ethernet (RoCE)
RDMA Network Fabric Adapter
RDMA Network Fabric Switch
RDMA Network Fabric
MAA Characteristics
Wait, in the past you have told me about how Exadata Fast Node Death Detection (FNDD) uses the InfiniBand Subnet Manager, but Exadata X8M does not have InfiniBand switches. How does FNDD work?
• 4 RDMA paths exist between database nodes and cells to monitor cell liveness
• If all four are unavailable after a short timeout expires, the cell is evicted
• <1 second to complete cell eviction, maintaining SLA
[Diagram: RDMA Reads between a Database Node and a Cell, with all four paths marked failed (X)]
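The eviction rule described on this slide (all four paths down for longer than a short timeout) can be sketched as a simple predicate. This only illustrates the stated rule, not Oracle's implementation; `PathState`, `should_evict_cell` and the timeout value are hypothetical, and the slide does not state the actual timeout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathState:
    """Hypothetical view of one RDMA path between a database node and a cell."""
    reachable: bool
    seconds_unreachable: float = 0.0


def should_evict_cell(paths: List[PathState], timeout_s: float) -> bool:
    """Evict only when every monitored path has been down for the full timeout."""
    if not paths:
        return False  # nothing is monitored, so there is no evidence of failure
    return all(
        (not p.reachable) and p.seconds_unreachable >= timeout_s
        for p in paths
    )
```

A single surviving path is enough to keep the cell in the cluster, which is why a cell is evicted only when all four paths have failed.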
Network Fabric Switch Software Updates
RDMA Network Fabric MAA Characteristics
• Same tool, patchmgr
• Separate software update package
• Optimized, built-in handling of port down/up events
• -verify-config and --roceswitch-precheck options available to check state ahead of time
Multi-Instance Redo Apply Performance
• Utilizes all RAC nodes on the Standby database to parallelize recovery
• OLTP workloads on Exadata show great scalability
Lower Latency Active Data Guard Standby Databases
[Chart: Standby Apply Rate (MB/sec) by number of apply instances]
Instances     Batch    OLTP
1 Instance      190     700
2 Instances     380    1400
4 Instances     740    2752
8 Instances    1480    5000
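To show how close the chart's apply rates come to linear scaling, the sketch below divides each rate by the single-instance rate times the instance count. It assumes the first run of data labels (190 to 1480) is the Batch series and the second (700 to 5000) is OLTP, following the legend order. The names `APPLY_RATE_MB_S` and `scaling_efficiency` are illustrative only.

```python
from typing import Dict

# Apply rates in MB/sec read from the chart; the series assignment is an assumption.
APPLY_RATE_MB_S: Dict[str, Dict[int, int]] = {
    "Batch": {1: 190, 2: 380, 4: 740, 8: 1480},
    "OLTP": {1: 700, 2: 1400, 4: 2752, 8: 5000},
}


def scaling_efficiency(series: Dict[int, int]) -> Dict[int, float]:
    """Return the rate at N instances divided by N times the 1-instance rate."""
    base = series[1]
    return {n: rate / (base * n) for n, rate in series.items()}


if __name__ == "__main__":
    for name, series in APPLY_RATE_MB_S.items():
        for n, eff in scaling_efficiency(series).items():
            print(f"{name:5s} {n} instance(s): {eff:.1%} of linear")
```

An efficiency of 1.0 means perfectly linear scaling; values below 1.0 show how much is lost as apply instances are added.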
Two Production Customer Examples
• Thomson-Reuters
  • Data Warehouse on Exadata, prior to write-back flash cache
  • While resolving a gap, observed an average apply rate of 580MB/second
• Allstate Insurance
  • Data Warehouse ETL processing resulted in an average apply rate over a 3 hour period of 668MB/second, with peaks hitting 900MB/second
Data Guard Redo Apply Performance
Exadata: Maximum Availability Architecture Features
Data Protection
Quality of Service
Management
Performance
Brownout Reduction
Code & Configuration
Brownouts and Blackouts
A brownout is a significant service level degradation. A blackout is a complete service level interruption.
Brownouts and blackouts translate to lost productivity and revenue.
Systems are complicated with many components, and an issue at one layer can easily cascade to another layer and exacerbate the impact.
Engineered systems are uniquely qualified to solve this very tough problem.
It's All about Service Levels
Exadata Marquee/New HA Features Reduced HA Brownout – Fast Node Death Detection on Database Nodes and Cells
Example of Database node power failure with an OLTP workload and CSS misscount=60
App Brownout in Typical Configuration
Each layer of the application stack has its own failure detection method.
Vendors try to obfuscate these details by quoting client side failure numbers.
In most cases the fault detection times are additive.
For example, if a storage controller crashes, it will take 2 SCSI timeouts for the database server to detect such a failure.
Storage Controller Storage Controller
SAN/LAN
ClusterwareTimeout
SCSI Timeout
Proprietary Protocol Timeouts
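Because detection times at each layer are additive, the worst-case brownout can be estimated by summing the timeouts that must expire in sequence. This is a minimal sketch of that arithmetic. The function name is hypothetical and the timeout values are illustrative placeholders, not figures from this presentation.

```python
from typing import Iterable, Tuple

# (layer name, number of timeouts that must expire, seconds per timeout)
Layer = Tuple[str, int, float]


def stacked_detection_seconds(layers: Iterable[Layer]) -> float:
    """Sum the detection delays of layers whose timeouts expire one after another."""
    return sum(count * seconds for _, count, seconds in layers)


if __name__ == "__main__":
    # Illustrative values only: two SCSI timeouts followed by one clusterware timeout.
    example = [
        ("SCSI timeout", 2, 30.0),
        ("Clusterware timeout", 1, 30.0),
    ]
    print(stacked_detection_seconds(example), "seconds before the failure is fully detected")
```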
Exadata: Unique Brownout Reduction Features
• If a server disappears from both InfiniBand switches, declare it dead in less than two seconds
• No waiting for long heartbeat timeouts
• Reduces application brownouts from 30+ seconds to < 2 seconds
Active/Active IB configuration provides:
• Extreme throughput - 40 Gb/s QDR
• Extreme availability - RDS failover in seconds with minimum application impact
Instant Failure Detection
Maximum Application Uptime
[Chart: Application Brownout (seconds). Exadata: 0.8; 3rd Party Storage: 300]
Exadata: Brownouts and Blackouts
Let’s watch a 1 minute video featuring our Fast Node Death Detection (FNDD) feature. If you watch carefully you will still be rewarded with one new feature referenced at about the 35 second mark
Maintaining Service Levels Since 2008
For more information on older features and best practices, see http://www.oracle.com/technetwork/database/availability/exadata-maa-best-practices-155385.html, MOS note 757552.1, exachk reports, prior OOW MAA Exadata presentations, and the Exadata documentation.
Brownouts and Blackouts
Flex ASM enables continuous RDBMS<->ASM communication after an ASM instance crash without the need for a service failover.
Completely transparent to the application with no service level impact.
Flex ASM
Flex ASM configured with cardinality=ALL on Exadata
Exadata: Brownouts and Blackouts
Recent Improvement in Brownout Associated With Network Port Failure
The brownout associated with active/passive client access network port failure is now 60% lower after we thoroughly verified a reduction to the network downdelay parameter. It also prevents false positive VIP failover when using OVM.
This configuration change is now in the default Exadata deployment and the best practice check is in exachk.
Grid Infrastructure 12c or higher / Exadata 12.1 or higher
Smart Handshake For Storage Server Shutdown
• Clear communication to the diskmon process on the database servers when storage is shut down prevents errors and application blackouts.
Your service level will smile!
Database Tier
Storage Tier
Summary: Smart Handshake For Storage Server Shutdown
Feature Oracle Has Provided: Graceful database tier handling during storage server shutdown
Best Practices You Can Implement: Use graceful shutdown procedures. Related: Use patchmgr for storage server software updates as it ensures grid disks are handled properly.
Service Level Impact Expectations: No blackouts when storage tier is shut down for maintenance. No false positive errors/alerts.
19c Grid Infrastructure. Maintaining SLAs During Storage Failures
Smart OLTP Caching
• SaaS application reading data from the primary mirror
• Storage failure on cell containing primary mirror
• No problem, just retrieve data from the secondary mirror on flash with low latency
• The tertiary mirror continues to provide protection just in case it's one of those days
• After the storage failure is repaired and the cell caching state is deemed healthy again, return to the primary mirror
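The read-routing behaviour in the bullets above can be sketched as picking the first healthy mirror in preference order. This is an illustration only, not the ASM or cell software logic. `choose_read_mirror` is a hypothetical name, and the per-mirror boolean stands in for "storage is repaired and the cell caching state is healthy".

```python
from typing import Dict, Optional

# Preference order implied by the slide: primary first, then secondary, then tertiary.
MIRROR_ORDER = ("primary", "secondary", "tertiary")


def choose_read_mirror(cell_healthy: Dict[str, bool]) -> Optional[str]:
    """Return the most preferred mirror whose cell is currently healthy.

    Hypothetical helper: a mirror missing from the mapping is treated as
    unhealthy, and None means no mirror can serve the read.
    """
    for mirror in MIRROR_ORDER:
        if cell_healthy.get(mirror, False):
            return mirror
    return None
```

With the primary's cell failed, reads go to the secondary mirror. Once the primary is marked healthy again, the same call returns "primary", matching the last bullet.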
Exadata: Maximum Availability Architecture Features
Data Protection
Quality of Service
Management
Performance
Brownout Reduction
Code & Configuration
Exadata has Many HA Features Supporting the Most Stringent SLAs
Fast node and cell death detection
Fast network failure detection
Redundancy protection on cellsrv shutdown
Reduced brownout for instance recovery
ILOM hang detection and repair
Redundancy protection on cell shutdown
Automatic ASM mirror read on IO error corruption
IO error prevention with Exadata disk scrubbing / ASM corruption repair
Exadata HARD
Corruption prevention with HARD support
Elimination of false positive drive failures
Redundancy Check during power down
Blue OK-to-remove LED light notification
Active Active IB Network
Exadata Smart Write Back, Smart Flash Logging, Smart Scan and Reverse Offload
Fastest Redo Apply and Instance Recovery
Efficient resilver rebalance after flash failure
I/O latency capping for reads and writes
Cell IO timeout threshold
Smart Write Back Flash Cache persistence
I/O and Network Resource Management
Health factor on predictively failed disks
Disk confinement
IO hang detection and repair
Cell to Cell offload for Disk Repair
Cell-to-Cell Rebalance Preserves Flash Cache
Exadata Elastic Configuration
Drop hard disk for replacement
Drop BBU for Replacement
Appliance mode support
Cell Alert Summary
Flash and Disk Life Cycle Management Alerts
Automatic LED support for disk removal
Auto online
Auto disk management
Priority rebalance support
EM failure reporting
Failure Monitoring on database servers
Updating database nodes with patchmgr
Optimized and Faster Exadata Patching
Custom Diagnostic Package for Cell Alerts
VLAN support and automation
Exachk – full stack health check with critical issues alerts
Exadata MAA Benefits
Pre-Packaged MAA: Faster deployment*, reduced guess-work & tuning requirements
Reduced Downtime: Few seconds of Blackout / Brownout
Corruption Prevention & Repair: Zero downtime using corruption prevention
HA Quality of Service: Meeting HA SLAs at any scale
Reliable & Scalable Performance: Reliable network & storage performance at any scale
End-to-End Management: Integrated tools/reports with end-to-end visibility
* Pre-Deployed in ExaCS / ExaCC
Exadata MAA Solution Integration
On-Premises Exadata
• All Exadata MAA configuration best practices baked in
• Exadata MAA operational best practices implemented by customer
Gen 2 Exadata Cloud at Customer
• All Exadata MAA configuration best practices baked in
• Some Exadata MAA operational best practices baked in
Gen 2 Exadata Cloud / Autonomous Database
• All Exadata MAA configuration best practices baked in
• All Exadata MAA operational best practices baked in
Exadata: Maximum Availability Architecture Features
Data Protection
Quality of Service
Management
Performance
Brownout Reduction
Code & Configuration
High Availability for Maximum Application Uptime
Only other AL4 Systems
• IBM - z Systems
• HPE - Integrity NonStop & Superdome
• Fujitsu – GS & BS2000
• NEC – FT Server/320 Series
• Stratus ftServer & V Series
• Unisys – Dorado
“Exadata and SuperCluster both achieve AL4 fault tolerance in a Maximum Availability Architecture configuration”
FIVE NINES (5X9): 99.999%
A New Gold Standard
IDC Report*
Exadata Delivers Real Business Value
Average results across eight Global 2000 companies
• Five-Year ROI: 429%
• 11 month average payback
• 94% less unplanned downtime
“…far less complicated, we don’t have distinct boxes to maintain, and we now have a single technology...”
- Oracle Customer
IDC: Business Value of Oracle Exadata Database Machine, September 2016
* IDC White Paper, Sponsored by Oracle, September 2016
Oracle Exadata Database Machine
Risk Mitigation – Downtime
Metric | Before Oracle Exadata | With Oracle Exadata | Difference | % Benefit
Unplanned Downtime
Number of instances per year | 7.1 | 0.7 | 6.5 | 90%
MTTR (hours) | 2.9 | 0.4 | 2.5 | 86%
Productive hours lost per 100 users per year | 1,021 | 66 | 955 | 94%
Unplanned Downtime – Revenue Impact
Total revenue impact per year | $423,700 | $5,800 | $417,900 | 99%
Planned Downtime
Number of instances per year | 10.9 | 6.0 | 4.9 | 45%
MTTR (hours) | 4.6 | 1.9 | 2.7 | 59%
Productive hours lost per 100 users per year | 68 | 60 | 8 | 12%
Source: IDC
Maximum Availability Architecture (MAA)
Summary
Less Risk, High Uptime = Better Results
Exadata is Highly Engineered and Standardized
• Less Deployment Risk and Faster to Market
• Delivered assembled, debugged, and ready-to-run
• Less Performance and Availability Risks
• Optimized database-to-disk including firmware, OS, network
• Industry experts at every layer of the stack help design, build and support Exadata. Includes MAA input, bug fixes, and configuration practices.
• Less Operating Risk
• All failure modes tested end-to-end. All systems identical.
• Reduces issue resolution times, reduces vendor management overhead and improves SLAs
• Operational Play Book (including online elasticity)
Best Mixed Workload Performance, Performance Isolation, Availability
Exadata for Consolidation and Database as a Service
• Any bottleneck on consolidated system can stall all workloads. Exadata eliminates bottlenecks
  – Highest network bandwidth, storage offload
  – Millions of I/Os per second, unique log optimizations
• Exadata uniquely prioritizes I/O by pluggable database, job, user, service, etc.
• Exadata uniquely prioritizes critical DB network messages through entire fabric
• Exadata uniquely unifies CPU prioritization with I/O prioritization for end-to-end assurance
Manufacturing
Marketing
Human Resources
Engineering
Sales
Service
IT/Operations
Finance and Accounting
Half OLTP - Half Analytics - Many Mixed
Exadata + MAA: Thousands of Critical Deployments
• Petabyte Warehouses
• Online Financial Trading
• Business Applications
  • SAP, Oracle, Siebel, PSFT, …
• Massive DB Consolidation
• Public SaaS Clouds
  • Oracle Fusion Apps, Salesforce, SAS, …
4 OF THE TOP 5 BANKS, TELCOS, RETAILERS RUN EXADATA
Exadata Advantages Increase Every Year
• Smart Scan
• InfiniBand Scale-Out
• Database Aware Flash Cache
• Storage Indexes
• Columnar Compression
• IO Priorities
• Data Mining Offload
• Offload Decrypt on Scans
• In-Memory Fault Tolerance
• Direct-to-wire Protocol
• JSON and XML offload
• Instant failure detection
• Network Resource Management
• Multitenant Aware Resource Mgmt
• Prioritized File Recovery
• Unified InfiniBand
• Scale-Out Servers
• Scale-Out Storage
• DB Processors in Storage
• PCIe NVMe Flash
• Tiered Disk/Flash
• Software-in-Silicon
• 3D V-NAND Flash
• In-Memory Columnar in Flash
• Exadata Cloud Service
• Smart Fusion Block Transfer
• Exadata Cloud at Customer
• In-Memory OLTP Acceleration
• Hot Swappable Flash
• 25 GigE Client Network
Dramatically Better Performance and Cost
Exadata Combinations with Other Engineered Systems
Exadata Database Machine
ZFS Backup Appliance
Zero Data Loss Recovery Appliance
Data Protection
Big Data Appliance
Big Data SQL
Exalogic Elastic Cloud
Private Cloud Appliance
Middleware / Apps
Oracle Database Appliance
Dept
SuperCluster
Oracle MiniCluster
Summary: High Availability Decisions Made Easier
Protection From | Use Exadata + MAA
Unintuitive double storage failures | High redundancy or Normal redundancy with Data Guard
Data loss and downtime | For disasters: Data Guard, Golden Gate (see http://www.oracle.com/goto/maa). For local failures requiring recovery: Test your restore/recovery strategy to ensure it works
Unexpected issues during planned maintenance | If you can negotiate the downtime within service levels, take it. If not, leverage rolling patch capabilities available at every tier
Unexpected production workload profile | Test environment similar to production, DBMS_WORKLOAD_REPLAY
Known critical issues | Run EXAchk monthly and when a new release is published
Resource depletion that affects service levels | Capacity Planning, Resource Management, Enterprise Manager, RAS for new customers
Over-customization | Walk the Oracle line as much as you can, and you will gain the most bang for your buck from your engineered system