Maximum Availability on Private Clouds
Ammar Fayoumi – Senior Presale Consultant
Oracle
Oct 19, 2011
Cairo, Egypt
No Time for Downtime?
© Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
From Barron’s Online
…their Web site has been down for hours. Around noon Pacific time, the site said it would be back up at 1 P.M. Pacific; a few minutes ago it said the site will be up again at 4 p.m. And now there is no time reference at all. The site now says:
The web site is temporarily unavailable. We apologize for any inconvenience this causes you.
Please visit us again soon. You can contact Customer Service at 1-888-xxx-xxxx
Note: The company’s stock price fell 6% on the same day
Twenty-Two Hour Outage: Popular E-Commerce Site
• Site outage due to large-scale disasters
– Fire, floods, hurricanes, earthquakes . . .
• Local outages that occur more frequently
– Faulty system components
– Data corruptions
– Backup/recovery of bad data
– Wrong batch job
– Operator errors
– Planned maintenance
– Bad HW/SW installations, upgrades …
Data Protection & High Availability (HA): Universal Requirements for Today’s Business
• Relies on idle redundancy
– Active/passive server failover
– Idle disaster recovery servers & storage
• Multiple 3rd party components to integrate
Traditional High Availability: Idle Redundancy
[Diagram: idle failover server, idle storage arrays, idle disaster recovery]
Fast Recovery Area
Active Data Guard
Oracle Maximum Availability Architecture: No Idle Redundancy
Automatic Storage Management
Real Application Clusters
Data Guard
Secure Backups to Cloud and Tape
[Diagram labels: Primary Database, Standby Database]
Oracle Maximum Availability Architecture: Remove the Need for Planned Downtime
Add/Remove Storage
Table & Index Redefinition
Undo Human Error
Add/Remove Nodes
Rolling Patches & PSUs
Rolling Upgrades
Automated Upgrade Testing
Online Application Upgrade
Evolving Computing Environments: Moving to Cloud Computing
• An extensive series of technical best practices to ensure maximum availability in consolidated environments and in the private cloud.
Maximum Availability Architecture (MAA): Architecture and Best Practices for the Private Cloud
Ref. http://www.oracle.com/goto/maa
Evolution Towards MAA: Start with a Single Server
Database Instance
Database Storage
Oracle benefits:
• Integrated corruption detection
• Integrated volume management (ASM)
• Optimum I/O performance
• Provision or migrate storage online
• Automatic storage tuning
• Integrated disk mirroring
• Integrated Oracle backups (RMAN + OSB)
• Block validation
• Online block-level recovery
• Unused block compression
• Online, multi-streamed backup
• Native encryption
• Multiple compression levels
• Fast backups to disk, tape, cloud (Amazon S3)
Oracle Database
Automatic Storage Management (ASM)
Recovery Manager (RMAN)
Oracle Secure Backup (OSB)
Vulnerable to downtime and data loss:
• Server failures
• Database instance crashes
• Administrator errors
• Data corruptions
• Network outages
• Site failures
• Most planned maintenance
• All users impacted by any outage
Evolution Towards MAA: Start with a Single Server, but …
Database Instance
Database Storage
CRASH
Protection from some downtime:
• Unplanned:
– Server failures
– Instance crashes
• Planned:
– Online relocation of Oracle instance
– Rolling upgrades for maintenance & patches
Risks of other downtime remain:
• Administrator errors
• Data corruptions
• Network outages
• Site failures
• Patch set and database upgrades
• Users impacted by many outages
• Standby server resources sit unused until a failure occurs
Evolution Towards MAA (contd.): Protect Instance with RAC One Node
Database Instance
Database Storage
Active Passive
Real Application Clusters (RAC) – best solution for Server HA:
• Protection from server failures
• Protection from database instance crashes
• Enables rolling patch upgrades
• All instances / servers are active
• Scale throughput by simply adding servers online
• Automated workload management
Still need to eliminate downtime from:
• Administrator errors
• Data corruptions
• Network outages
• Site failures
• Patch set and database upgrades
Evolution Towards MAA (contd.): Enhance Server Scalability & Availability with RAC
Database Instance
Database Storage
RAC: All Servers Active
Oracle Real Application Clusters: What Customers Think…
Flashback Database
Flashback Transaction
Flashback Table
Flashback Query
Fast recovery from human errors
• Oracle-integrated Continuous Data Protection (CDP)
• Operates only on changed data in a highly optimized manner
• Reduces correction time from hours to minutes
• Correction Time = Error Time, rather than Error Time + f(DB_SIZE) as with a restore-based recovery
• Simple commands instead of complex procedures
Evolution Towards MAA (contd.): Eliminate Administrator Errors with Flashback
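To make the correction-time bullet concrete, here is a minimal sketch in Python. It contrasts a restore-based recovery, whose duration includes a term that grows with database size (the f(DB_SIZE) term), with a Flashback-style correction that works only on changed data. The function names, the reading of "error time" as time spent identifying the error, and all throughput figures are assumptions made for illustration; they are not Oracle measurements.

```python
def restore_correction_minutes(error_minutes, db_size_gb, restore_gb_per_min):
    """Time to identify the error plus a term that grows with total database size."""
    return error_minutes + db_size_gb / restore_gb_per_min


def flashback_correction_minutes(error_minutes, changed_gb, rewind_gb_per_min):
    """Time to identify the error plus a term that grows only with changed data."""
    return error_minutes + changed_gb / rewind_gb_per_min


if __name__ == "__main__":
    # Hypothetical workload: 2 TB database, 1 GB changed since the error.
    restore = restore_correction_minutes(10, db_size_gb=2000, restore_gb_per_min=10)
    flashback = flashback_correction_minutes(10, changed_gb=1, rewind_gb_per_min=10)
    print(f"restore-based: {restore:.1f} min, flashback-style: {flashback:.1f} min")
```

With these made-up inputs the size-dependent term dominates the restore path, which is the effect the slide's "hours to minutes" claim points at.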
Evolution Towards MAA (contd.): Disaster Protection: How about Storage Mirroring?
Database Instance
Database Storage
Primary Site: RAC - All Servers Active | DR Site: All Servers Inactive
Storage Mirroring
Storage mirroring: Redundant storage protects from storage failures, but:
• No protection from physical data corruptions
• DR systems offline during mirroring – no real-time data validation
• Distance limited, storage vendor lock-in, manual failover, no rolling upgrades, high network use
[Diagram: Primary Volumes to Target Volumes over network I/O. Items shown: log buffer, online logs, archive logs, flashback logs, control files, data files (SYSTEM, USER, TEMP, UNDO), Oracle apply & validation.]
• 7X more network volume
• 27X more network I/Os
• Zero Oracle awareness
• Poor isolation
• Idle standby systems
Example: Storage Mirroring Weakness: High Usage of Network Resources
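As a rough illustration of what the 7X and 27X figures mean for network sizing, the sketch below scales a baseline by those two multipliers. It assumes the baseline is redo-only transport of the kind Data Guard uses; the slide does not state the baseline explicitly, and the function name and sample rates are hypothetical.

```python
# Multipliers quoted on the slide for storage mirroring.
# Assumption: they are relative to shipping database redo only.
VOLUME_MULTIPLIER = 7
IO_MULTIPLIER = 27


def mirroring_network_load(redo_mb_per_s, redo_ios_per_s):
    """Estimate storage-mirroring traffic from a redo-only baseline (illustrative)."""
    return {
        "volume_mb_per_s": redo_mb_per_s * VOLUME_MULTIPLIER,
        "ios_per_s": redo_ios_per_s * IO_MULTIPLIER,
    }


if __name__ == "__main__":
    # Hypothetical baseline: 10 MB/s of redo in 100 network I/Os per second.
    print(mirroring_network_load(10, 100))
```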
Evolution Towards MAA (contd.): Integrated HA & DR with RAC and Active Data Guard
Database Instance
Database Storage
Primary Site: RAC - All Servers Active | Standby Site: All Servers Read-Only Active
Data Guard
Data Guard: Oracle-integrated disaster recovery & data protection solution:
• Protection from storage, site, network failures and data corruptions
• Reporting, queries, backups offloaded to DR systems – supports real-time data validation
• No distance limitation, storage agnostic, automatic failover, rolling upgrades, optimized network use
[Diagram: Primary Database to Standby Database over network I/O. Items shown: log buffer, online logs, archive logs, flashback logs, control files, data files (SYSTEM, USER, TEMP, UNDO), Oracle apply & validation.]
• End-to-end validation
• Storage agnostic
• Automatic block repair
• Real-time reporting
• Oracle-aware physical replication
• Strong isolation
Example: Data Guard Strength: Optimized Usage of Network Resources
Data center moves
Technology refresh
32 bit to 64 bit
Windows to Linux
AIX 64 bit to Solaris Sparc
Migrate to RAC
Migrate to ASM
Migrate to Exadata
System maintenance
Database rolling upgrades
Index and storage changes
Implement Advanced Compression
Migrate to SecureFiles
Test new features
See My Oracle Support Note 413484.1 for details
Data Guard: Rolling Maintenance: Minimize Planned Downtime
Protection against downtime and data loss:
Server failures
Database instance crashes
Storage subsystem failures
System induced data corruptions
Administrator errors
Network outages
Site failures
System maintenance
One-off patches and CPUs
Database patch-sets and upgrades
Oracle Database, ASM, RMAN, Oracle RAC, Flashback, Data Guard, Enterprise Manager
Primary Database (open read-write) | Active Standby Database (open read-only)
Oracle Enterprise Manager
Data Guard
Evolution Towards MAA (contd.): Scorecard So Far
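The scorecard can be read as a running set difference: each layer added along the way removes some risks from the list that was still open. The sketch below replays that with Python sets, using the risk names from the earlier slides. The assignment of risks to layers is a simplification of what the slides credit to each product, and the variable and function names are hypothetical.

```python
# Risks named on the earlier slides: the two RAC covers plus the five it leaves open.
ALL_RISKS = {
    "server failures",
    "database instance crashes",
    "administrator errors",
    "data corruptions",
    "network outages",
    "site failures",
    "patch set and database upgrades",
}

# What each layer is credited with on the slides (simplified).
STAGES = [
    ("RAC", {"server failures", "database instance crashes"}),
    ("Flashback", {"administrator errors"}),
    ("Data Guard", {"data corruptions", "network outages", "site failures",
                    "patch set and database upgrades"}),
]


def remaining_after_each_stage(all_risks, stages):
    """Return (stage name, sorted remaining risks) after adding each stage in order."""
    remaining = set(all_risks)
    report = []
    for name, covered in stages:
        remaining -= covered
        report.append((name, sorted(remaining)))
    return report


if __name__ == "__main__":
    for name, left in remaining_after_each_stage(ALL_RISKS, STAGES):
        print(f"after {name}: {len(left)} risk(s) still open: {left}")
```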
• Database rolling upgrades from pre-Oracle Database 10g
• Flexible upgrade strategies that incorporate active-active multimaster replication
• Cross-endian platform migrations
• Migration from non-Oracle databases to Oracle
• A number of other cross platform migrations
• Application upgrades
Planned Maintenance: Not Yet Addressed Completely
RAC, Data Guard, GoldenGate: Integrated HA, DR and Active-Active Replication
GoldenGate
- Information Distribution
- Heterogeneous
Active Data Guard (Standby Database)
- DR & Data Protection
- Real-time Query
RAC (Primary Database)
- Scalability
- Server HA
[Diagram labels: Bi-directional Replication, Subsetting, MySQL]
Oracle Maximum Availability Architecture: Low-cost, Integrated, Fully Active, High ROI
[Diagram: Production Site and Active Replica]
• Active Data Guard – Data protection, DR, query offload
• GoldenGate – Active-active, heterogeneous
• Oracle Secure Backup – Backup to tape / cloud
• Edition-based Redefinition, Data Guard, GoldenGate – Minimal downtime maintenance, upgrades, migrations
• RAC – Scalability, server HA
• Flashback – Human error correction
• ASM – Volume management
• RMAN & Fast Recovery Area – On-disk backups
Outage                     | Oracle Solution(s)                  | Downtime
Server failure             | Oracle RAC, Clusterware             | Zero
Storage failure            | Automatic Storage Management (ASM)  | Zero
Database and site failure  | Oracle Data Guard                   | < 60 seconds
Data corruptions           | Oracle Data Guard                   | Zero to < 60 seconds
Human errors               | Oracle Flashback Technologies       | 80x faster than restore
MAA Recommendations: Protection from Unplanned Outages
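To put the "< 60 seconds" figure in perspective, the short Python sketch below converts a number of failovers per year into an availability percentage. The outage count is a made-up input, 60 seconds is used as the upper bound from the table, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def availability_percent(outages_per_year, seconds_per_outage):
    """Availability implied by a number of fixed-length outages per year."""
    downtime = outages_per_year * seconds_per_outage
    return 100.0 * (1 - downtime / SECONDS_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical: four failovers a year, each at the 60-second upper bound.
    print(f"{availability_percent(4, 60):.5f}%")
```

The same function can be fed hours-long outages, such as the one on the opening slide, to see how quickly a single long outage erodes the yearly figure.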
Planned Event | Oracle Solution(s) | Downtime
Operating system and hardware maintenance, add/remove cluster nodes or storage | Oracle RAC, Clusterware, ASM | Zero
Oracle one-off patches, critical patch updates, file system and clusterware upgrades | Oracle RAC, Clusterware, ASM | Zero
Site maintenance, cluster-wide maintenance | Oracle Data Guard, GoldenGate | Minimal or zero
Oracle patch-set and full Oracle release upgrades | Oracle Data Guard, GoldenGate | Minimal or zero
Platform migrations | Oracle Data Guard, GoldenGate | Minimal or zero
Application upgrades | Edition-based redefinition, GoldenGate | Zero
MAA Recommendations: Minimizing Planned Downtime
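For planning purposes, the planned-maintenance table can be kept as a small lookup structure and queried by product. The sketch below is one possible encoding in Python; the shortened event keys and the helper name are hypothetical, while the solutions and downtime values are copied from the table.

```python
# Planned event -> (Oracle solutions, expected downtime), as listed in the table.
PLANNED_EVENTS = {
    "os and hardware maintenance": (("Oracle RAC", "Clusterware", "ASM"), "Zero"),
    "one-off patches and clusterware upgrades": (("Oracle RAC", "Clusterware", "ASM"), "Zero"),
    "site and cluster-wide maintenance": (("Oracle Data Guard", "GoldenGate"), "Minimal or zero"),
    "patch-set and release upgrades": (("Oracle Data Guard", "GoldenGate"), "Minimal or zero"),
    "platform migrations": (("Oracle Data Guard", "GoldenGate"), "Minimal or zero"),
    "application upgrades": (("Edition-based redefinition", "GoldenGate"), "Zero"),
}


def events_using(solution):
    """Return the planned events, sorted by name, that list the given solution."""
    return sorted(
        event
        for event, (solutions, _downtime) in PLANNED_EVENTS.items()
        if solution in solutions
    )


if __name__ == "__main__":
    for event in events_using("GoldenGate"):
        print(event, "->", PLANNED_EVENTS[event][1])
```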
Resources
• Database High Availability
– oracle.com/ha
• High Availability Best Practices
– oracle.com/goto/maa
• Active Data Guard Hands-On Lab
– oracle.com/technetwork/database/features/availability/data-guard-hol-176005.html
– Same experience as on-site lab at Oracle OpenWorld 2010
– Linked from both the above two portals