journey through the cloud - disaster recovery

Journey Through the Cloud: Disaster Recovery. Ian Massingham, Technical Evangelist ([email protected], @IanMmmm)

Upload: amazon-web-services

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Journey Through The Cloud - Disaster Recovery

Journey Through the Cloud

[email protected] @IanMmmm

Ian Massingham — Technical Evangelist

Disaster Recovery

Page 2: Journey Through The Cloud - Disaster Recovery

Journey Through the Cloud

Learn from the journeys taken by other AWS customers

Discover best practices that you can use to bootstrap your projects

Common use cases and adoption models for the AWS Cloud

Page 3: Journey Through The Cloud - Disaster Recovery

Disaster Recovery

Explore and learn about AWS with a ‘non-production’ use case

Phase systems into ‘live’ DR use with reduced risk

Benefit from lower costs & only pay for what you use

Gain the ability to test DR procedures more frequently

Invoke DR whilst testing DR procedures if necessary

Page 4: Journey Through The Cloud - Disaster Recovery

Agenda

Why AWS for disaster recovery?

AWS services that are relevant for DR use-cases

Common DR architectures

Customer case studies and examples

Resources to learn more

Page 5: Journey Through The Cloud - Disaster Recovery

Using AWS for DR Provision

https://aws.amazon.com/solutions/case-studies/sunpower/

Page 6: Journey Through The Cloud - Disaster Recovery

Business & Technical Drivers for DR in the Cloud

▶︎ Minimise costs

▶︎ Reduce on-premises infrastructure

▶︎ Consolidate sites

▶︎ Remove ageing technologies

Page 7: Journey Through The Cloud - Disaster Recovery

DR & Business Continuity

DR forms part of a wider set of policies & controls

High availability: Keep your applications running 24x7

Backup: Make sure your data is safe

Disaster recovery: Get your applications and data back after a major disaster

Page 8: Journey Through The Cloud - Disaster Recovery

IT’S NOT BINARY

Page 9: Journey Through The Cloud - Disaster Recovery

DR & Business Continuity

Recovery Time Objective (RTO): How quickly do I need this service to be recovered? 1 minute? 15 minutes? 1 hour? 4 hours? 1 day?

Recovery Point Objective (RPO): How much data loss can be tolerated? Zero data loss? 15 minutes out of date?

Each application or service will have specific requirements
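
To make the two objectives concrete, the following minimal sketch checks a candidate DR design against an application's RTO and RPO. The class names, the example applications and every figure are illustrative assumptions, not values from the deck.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Objectives:
    """Targets agreed with the business for one application."""
    rto: timedelta  # maximum acceptable time to restore service
    rpo: timedelta  # maximum acceptable window of lost data


@dataclass
class DrPlan:
    """What a candidate DR design is expected to deliver (hypothetical figures)."""
    estimated_recovery_time: timedelta  # time to bring the service back
    backup_interval: timedelta          # worst-case age of the newest copy


def meets_objectives(plan: DrPlan, target: Objectives) -> bool:
    """Return True when the plan satisfies both the RTO and the RPO."""
    recovers_in_time = plan.estimated_recovery_time <= target.rto
    loses_little_enough = plan.backup_interval <= target.rpo
    return recovers_in_time and loses_little_enough


# Hypothetical applications at opposite ends of the spectrum.
web_app = Objectives(rto=timedelta(minutes=15), rpo=timedelta(0))
reporting = Objectives(rto=timedelta(days=1), rpo=timedelta(hours=4))

# Hypothetical designs: restore from a nightly or a four-hourly backup.
nightly_restore = DrPlan(timedelta(hours=8), timedelta(hours=24))
four_hourly = DrPlan(timedelta(hours=8), timedelta(hours=4))

print(meets_objectives(nightly_restore, web_app))
print(meets_objectives(four_hourly, reporting))
```

Under these assumptions the nightly restore fails the web application on both counts, while the four-hourly backup is enough for the reporting system, which is why the objectives need to be set per application.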

Page 10: Journey Through The Cloud - Disaster Recovery

DR & Business Continuity

Customer facing transactional web application

Internal collaboration system

Daily scheduled processes & systems

Backend reporting system & database

Applications can be placed on a spectrum of complexity…

Rebuild when required from offsite backup

Run hot-hot configuration with auto-failover

Page 11: Journey Through The Cloud - Disaster Recovery

The Utility, On-Demand Data Centre

Primary Site Routers Firewalls Network

Application Licenses Operating Systems

Hypervisor Servers

SAN fabric Primary Storage

Backup Archive

Secondary Site Routers Firewalls Network

Application Licenses Operating Systems

Hypervisor Servers

SAN fabric Primary Storage

Backup Archive

Page 12: Journey Through The Cloud - Disaster Recovery

The Utility, On-Demand Data Centre

Primary Site Routers Firewalls Network

Application Licenses Operating Systems

Hypervisor Servers

SAN fabric Primary Storage

Backup Archive

AWS Region Routers Firewalls Network

Application Licenses Operating Systems

Hypervisor Servers

SAN fabric Snapshot Storage

Backup Archive

Page 13: Journey Through The Cloud - Disaster Recovery

The Utility, On-Demand Data Centre

Primary Site Routers Firewalls Network

Application Licenses Operating Systems

Hypervisor Servers

SAN fabric Primary Storage

Backup Archive

AWS Region Routers Firewalls Network

Application Licenses Operating Systems

Hypervisor Servers

SAN fabric Snapshot Storage

Backup Archive

Secondary site costs

Page 14: Journey Through The Cloud - Disaster Recovery

AWS Global Footprint

11 regions, 28 availability zones, 51 edge locations

https://aws.amazon.com/about-aws/global-infrastructure/

Page 15: Journey Through The Cloud - Disaster Recovery

AWS security approach

Size of AWS security team

Visibility into usage & resources

Increasing your Security Posture in the Cloud

https://aws.amazon.com/security

Page 16: Journey Through The Cloud - Disaster Recovery

Broad Accreditations & Certifications

https://aws.amazon.com/compliance

Page 17: Journey Through The Cloud - Disaster Recovery

Partner ecosystem Customer ecosystem Everyone benefits

Security Benefits from Community Network Effect

Page 18: Journey Through The Cloud - Disaster Recovery

RELEVANT AWS SERVICES

Page 19: Journey Through The Cloud - Disaster Recovery

Object Storage & Transfer Services

Amazon S3 AWS Import/Export AWS Storage Gateway

Page 20: Journey Through The Cloud - Disaster Recovery

AWS Import/Export Disk


https://aws.amazon.com/importexport/disk/details/

Accelerates moving large amounts of data

Uses portable storage devices for transport

Often faster than internet transfer for large data sets

Supported regions: US East (N. Virginia), US West (Oregon), US West (Northern California), EU (Ireland), and Asia Pacific (Singapore)
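
The claim that shipping disks is often faster than an internet transfer can be checked with simple arithmetic. The sketch below is an illustration only: the link speed, the 80% utilisation figure and the shipping time are assumptions, and decimal units are used (1 TB = 10**12 bytes).

```python
def transfer_days(size_tb: float, link_mbps: float, utilisation: float = 0.8) -> float:
    """Days needed to push size_tb terabytes over a link of link_mbps megabits/s.

    utilisation is the assumed share of the link that the transfer can use.
    """
    if size_tb < 0 or link_mbps <= 0 or not 0 < utilisation <= 1:
        raise ValueError("size must be >= 0, link speed > 0, utilisation in (0, 1]")
    bits = size_tb * 10**12 * 8
    seconds = bits / (link_mbps * 10**6 * utilisation)
    return seconds / 86_400


def shipping_is_faster(size_tb: float, link_mbps: float, shipping_days: float,
                       utilisation: float = 0.8) -> bool:
    """Compare the network estimate with an assumed door-to-door shipping time."""
    return transfer_days(size_tb, link_mbps, utilisation) > shipping_days


# 10 TB over a 100 Mbps link at 80% utilisation takes roughly 11.6 days.
print(round(transfer_days(10, 100), 1))
# With an assumed 7-day shipping and import turnaround, the disk wins.
print(shipping_is_faster(10, 100, shipping_days=7))
```

For small data sets or fast links the comparison reverses, so it is worth running the numbers for your own connection before choosing a transfer method.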

Page 21: Journey Through The Cloud - Disaster Recovery

AWS Import/Export Snowball

https://aws.amazon.com/importexport/

A single Snowball appliance can transport up to 50 terabytes of data

Page 22: Journey Through The Cloud - Disaster Recovery

Using AWS Storage Services for DR

Amazon S3 & Amazon Elastic Block Store

Simple Storage Service: Highly scalable object storage

Objects from 1 byte to 5TB in size

99.999999999% durability

Elastic Block Store: High performance block storage device

Volumes from 1GB to 16TB in size

Snapshot and cloning functionality

Page 23: Journey Through The Cloud - Disaster Recovery

Networking & Connectivity Services

AWS Direct Connect Amazon Virtual Private Cloud (VPC) Amazon Route 53

Page 24: Journey Through The Cloud - Disaster Recovery

Connecting to AWS

VPN Connection

[Diagram: Your Resources on your premises/network connect through a Customer Gateway and a VPN Connection to the VPC VPN Gateway, which leads to AWS Resources in Amazon VPC]

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpn-connections.html

Page 25: Journey Through The Cloud - Disaster Recovery

Connecting to AWS

Direct Connect

[Diagram: Your Resources on your premises/network connect over Direct Connect to AWS Resources in Amazon VPC]

https://aws.amazon.com/directconnect/

Page 26: Journey Through The Cloud - Disaster Recovery

Foundation Services

Amazon EC2 Amazon Relational Database Service (RDS)

Amazon Elastic Block Store (EBS)

Page 27: Journey Through The Cloud - Disaster Recovery

COMMON ARCHITECTURES FOR DISASTER RECOVERY

Page 28: Journey Through The Cloud - Disaster Recovery

Common Architectures for Disaster Recovery

4 Main Patterns

Backup & Restore

Pilot light

Warm standby in AWS

Multi-site solution AWS & on-premises

Page 29: Journey Through The Cloud - Disaster Recovery

Common Architectures for Disaster Recovery

We’ll focus on two, starting with Backup & Restore

Backup & Restore

Pilot light

Warm standby in AWS

Multi-site solution AWS & on-premises

Page 30: Journey Through The Cloud - Disaster Recovery

Store backup data in the AWS Cloud

Page 31: Journey Through The Cloud - Disaster Recovery

Store AMIs for server operating system images

Page 32: Journey Through The Cloud - Disaster Recovery

Recover servers during DR testing or invocation

Page 33: Journey Through The Cloud - Disaster Recovery

Backup & Restore Pattern

Advantages from starting here…

Simple to get started

Easy starting point for exploring the AWS cloud

Low technical barrier to entry

Focus on incorporating cloud into your DR strategy, not on complex technical issues related to hot-hot systems

Cost effective

Very high levels of data durability at low price

Cost of storing snapshots in S3

Archiving possibilities beyond tape using Glacier

Page 34: Journey Through The Cloud - Disaster Recovery

Backup & Restore Pattern

Getting started…

Take backups of configuration state & data

Store Backups in Amazon S3

Move to long term archive in Glacier
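
The last step, moving backups to long-term archive, is typically expressed as an S3 lifecycle rule. The sketch below only builds the rule as a Python dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; the prefix, the day counts and the bucket name in the comment are placeholder assumptions.

```python
import json


def backup_lifecycle_rule(prefix: str, archive_after_days: int,
                          expire_after_days: int) -> dict:
    """Build one S3 lifecycle rule: transition to Glacier, then expire."""
    if not 0 <= archive_after_days < expire_after_days:
        raise ValueError("objects must be archived before they expire")
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": expire_after_days},
    }


# Assumed policy: archive backups after 30 days, delete them after a year.
lifecycle = {"Rules": [backup_lifecycle_rule("backups/", 30, 365)]}
print(json.dumps(lifecycle, indent=2))

# With boto3 installed and credentials configured, the same dictionary could be
# applied to a bucket (the bucket name is a placeholder):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-dr-backups", LifecycleConfiguration=lifecycle)
```

Keeping the rule as plain data makes it easy to review and version alongside the rest of the DR runbook before it is applied to a real bucket.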

Page 35: Journey Through The Cloud - Disaster Recovery

Backup & Restore Pattern

Options…

Gateway Backup Appliance Direct Access to Amazon S3

AWS Storage Gateway

Page 36: Journey Through The Cloud - Disaster Recovery

Amazon S3

Standard, Standard - Infrequent Access, Glacier

https://aws.amazon.com/s3/

Page 37: Journey Through The Cloud - Disaster Recovery

Amazon S3

Standard - Infrequent Access

https://aws.amazon.com/s3/storage-classes/

Page 38: Journey Through The Cloud - Disaster Recovery

Amazon Glacier

https://aws.amazon.com/glacier/

Page 39: Journey Through The Cloud - Disaster Recovery

Amazon Glacier

Durable: Designed for 99.999999999% durability of archives

Cost Effective: Write-once, read-never. Cost effective for long term storage. Pay for accessing data

https://aws.amazon.com/glacier/

Page 40: Journey Through The Cloud - Disaster Recovery

Logs: accessible from S3

[Timeline diagram: logs, with an Expiry marker on the time axis]

Page 41: Journey Through The Cloud - Disaster Recovery

Logs: accessible from S3

✗ Objects expire and are deleted

[Timeline diagram: Expiry marker on the time axis]

Page 42: Journey Through The Cloud - Disaster Recovery

Logs: accessible from S3

✗ Objects expire and are deleted

Txns: accessible from S3

Object transition to Glacier invoked

[Timeline diagram: Expiry and Transition markers on the time axis]

Page 43: Journey Through The Cloud - Disaster Recovery

Logs: accessible from S3

✗ Objects expire and are deleted

Txns: accessible from S3

Object transition to Glacier invoked

Restoration of object requested for x hrs

[Timeline diagram: Expiry and Transition markers on the time axis]

Page 44: Journey Through The Cloud - Disaster Recovery

Logs: accessible from S3

✗ Objects expire and are deleted

Txns: accessible from S3

Object transition to Glacier invoked

Restoration of object requested for x hrs

3-5hrs

Object held in S3 RRS for x hrs

[Timeline diagram: Expiry and Transition markers on the time axis]
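
The lifecycle timeline on these slides can be read as a small state model. The sketch below is one interpretation, not an AWS API: it assumes the "3-5hrs" label is the retrieval delay after a restore request, and all ages and hold times are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def object_state(age_days: float, transition_days: Optional[float],
                 expiry_days: Optional[float]) -> str:
    """Where an object sits on the lifecycle timeline at a given age."""
    if expiry_days is not None and age_days >= expiry_days:
        return "deleted"
    if transition_days is not None and age_days >= transition_days:
        return "archived in Glacier"
    return "accessible from S3"


def restore_window(requested_at: datetime, hold_hours: float,
                   retrieval_hours: float = 5.0) -> Tuple[datetime, datetime]:
    """When a restored copy becomes readable and when it is removed again.

    retrieval_hours defaults to the upper end of the 3-5 hour range on the slide.
    """
    available_from = requested_at + timedelta(hours=retrieval_hours)
    return available_from, available_from + timedelta(hours=hold_hours)


# Logs: expiry rule only. Txns: transition rule only. Day counts are assumed.
print(object_state(45, transition_days=None, expiry_days=30))
print(object_state(45, transition_days=30, expiry_days=None))
print(restore_window(datetime(2017, 4, 16, 9, 0), hold_hours=24))
```

The point for DR planning is the delay before archived data is readable: any recovery step that depends on archived objects has to fit that wait inside the RTO.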

Page 45: Journey Through The Cloud - Disaster Recovery

Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway

AWS Storage Gateway installed on-premises to synchronize local volumes

https://aws.amazon.com/storagegateway/

Page 46: Journey Through The Cloud - Disaster Recovery

Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway

Local volumes created under Storage Gateway

Page 47: Journey Through The Cloud - Disaster Recovery

Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway

Usable with on-premises servers via iSCSI interface

Page 48: Journey Through The Cloud - Disaster Recovery

Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway

Primary on-premises volumes snapshotted, compressed and stored in Amazon S3

Page 49: Journey Through The Cloud - Disaster Recovery

✕Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway✕

Page 50: Journey Through The Cloud - Disaster Recovery

Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway

Snapshot pulled from S3 to restore local volume

Page 51: Journey Through The Cloud - Disaster Recovery

Storage Gateway

Corporate Data Center Elastic Data Center

AWS Storage Gateway

Snapshot pulled from S3 to create cloud instance backed by volume

Page 52: Journey Through The Cloud - Disaster Recovery

Gateway stored volumes

Data stored locally

Asynchronous backup

EBS snapshots

iSCSI local interface

Up to 16TB volumes

Up to 12 volumes

Gateway cached volumes

Data stored in S3

Recently read data cached

Low latency

iSCSI local interface

Up to 32TB volumes

Up to 32 volumes
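
The per-gateway limits above translate into a maximum capacity for each mode. The sketch below just multiplies the figures quoted on the slide, which may have changed since; the 500 TB data set is a hypothetical example, and the estimate ignores how data is split across individual volumes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeLimits:
    """Per-gateway limits as quoted on the slide (they may have changed since)."""
    max_volume_tb: int
    max_volumes: int

    @property
    def max_capacity_tb(self) -> int:
        # Largest total capacity: every volume at its maximum size.
        return self.max_volume_tb * self.max_volumes


STORED = VolumeLimits(max_volume_tb=16, max_volumes=12)
CACHED = VolumeLimits(max_volume_tb=32, max_volumes=32)


def gateways_needed(dataset_tb: float, limits: VolumeLimits) -> int:
    """Minimum number of gateways whose combined capacity covers dataset_tb."""
    if dataset_tb <= 0:
        raise ValueError("dataset size must be positive")
    return math.ceil(dataset_tb / limits.max_capacity_tb)


print(STORED.max_capacity_tb)   # 192
print(CACHED.max_capacity_tb)   # 1024
print(gateways_needed(500, STORED), gateways_needed(500, CACHED))  # 3 1
```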

Page 53: Journey Through The Cloud - Disaster Recovery

AWS Storage Gateway

Gateway-Virtual Tape Library (VTL)

http://docs.aws.amazon.com/storagegateway/latest/userguide/Requirements.html#requirements-backup-sw-for-vtl

Page 54: Journey Through The Cloud - Disaster Recovery

Storage appliances & backup management

Page 55: Journey Through The Cloud - Disaster Recovery

RDS & Oracle RMAN

https://d0.awsstatic.com/whitepapers/strategies-for-migrating-oracle-database-to-aws.pdf

Page 56: Journey Through The Cloud - Disaster Recovery

Common Architectures for Disaster Recovery

Next, let’s take a look at the Pilot Light pattern

Backup & Restore

Pilot light

Warm standby in AWS

Multi-site solution AWS & on-premises

Page 57: Journey Through The Cloud - Disaster Recovery

Pilot light architecture

Build resources around replicated dataset

Keep ‘pilot light’ on by replicating core databases

Build AWS resources around dataset and leave in stopped state

Page 58: Journey Through The Cloud - Disaster Recovery

Pilot light architecture

Build resources around replicated dataset

Scale AWS resources in response to a DR event

Keep ‘pilot light’ on by replicating core databases

Build AWS resources around dataset and leave in stopped state

Start up pool of resources in AWS when events dictate

Match required production capacity through auto-scaling policies

Page 59: Journey Through The Cloud - Disaster Recovery

Pilot light architecture

Build resources around replicated dataset

Scale AWS resources in response to a DR event

Keep ‘pilot light’ on by replicating core databases

Build AWS resources around dataset and leave in stopped state

Start up pool of resources in AWS when events dictate

Match required production capacity through auto-scaling policies

Cut over to the system in AWS
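
The pilot light steps above can be sketched as an ordered runbook. This is an in-memory illustration with no AWS calls: the class, the capacity units and the numbers are hypothetical, and in practice the scaling step would be driven by auto-scaling policies rather than a direct calculation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PilotLightSite:
    """In-memory stand-in for the AWS side of a pilot light setup."""
    stopped_instances: int                  # pre-built servers kept stopped
    running_instances: int = 0
    serving_traffic: bool = False
    log: list = field(default_factory=list)

    def start_pool(self) -> None:
        """Step 1: start the pre-built, stopped instances."""
        self.running_instances += self.stopped_instances
        self.stopped_instances = 0
        self.log.append(f"started pool: {self.running_instances} running")

    def scale_to(self, required_capacity: float, capacity_per_instance: float) -> None:
        """Step 2: add instances until production capacity is matched."""
        needed = math.ceil(required_capacity / capacity_per_instance)
        self.running_instances = max(self.running_instances, needed)
        self.log.append(f"scaled: {self.running_instances} running")

    def cut_over(self) -> None:
        """Step 3: send production traffic to AWS, only once servers are up."""
        if self.running_instances == 0:
            raise RuntimeError("nothing is running; start the pool first")
        self.serving_traffic = True
        self.log.append("cut over to AWS")


site = PilotLightSite(stopped_instances=2)
site.start_pool()
site.scale_to(required_capacity=900, capacity_per_instance=200)
site.cut_over()
print(site.log)
```

Encoding the order as code, with a guard on the cut-over step, mirrors the idea that the replicated data is always current while compute is only started when events dictate.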

Page 60: Journey Through The Cloud - Disaster Recovery

Stopped instances

Pilot Light

Page 61: Journey Through The Cloud - Disaster Recovery

Running instances

Pilot Light

Page 62: Journey Through The Cloud - Disaster Recovery

RESOURCES YOU CAN USE TO LEARN MORE

Page 63: Journey Through The Cloud - Disaster Recovery

aws.amazon.com/disaster-recovery/

Page 64: Journey Through The Cloud - Disaster Recovery

AWS Disaster Recovery White Paper

Amazon Web Services – Using AWS for Disaster Recovery October 2014


Using Amazon Web Services for Disaster Recovery October 2014

Glen Robinson, Attila Narin, and Chris Elleman


Contents
Introduction
Recovery Time Objective and Recovery Point Objective
Traditional DR Investment Practices
AWS Services and Features Essential for Disaster Recovery
Example Disaster Recovery Scenarios with AWS
  Backup and Restore
  Pilot Light for Quick Recovery into AWS
  Warm Standby Solution in AWS
  Multi-Site Solution Deployed on AWS and On-Site
  AWS Production to an AWS DR Solution Using Multiple AWS Regions
Replication of Data
Failing Back from a Disaster
Improving Your DR Plan
Software Licensing and DR
Conclusion
Further Reading
Document Revisions


Warm Standby Solution in AWS

The term warm standby is used to describe a DR scenario in which a scaled-down version of a fully functional environment is always running in the cloud. A warm standby solution extends the pilot light elements and preparation. It further decreases the recovery time because some services are always running. By identifying your business-critical systems, you can fully duplicate these systems on AWS and have them always on.

These servers can be running on a minimum-sized fleet of Amazon EC2 instances on the smallest sizes possible. This solution is not scaled to take a full-production load, but it is fully functional. It can be used for non-production work, such as testing, quality assurance, and internal use.

In a disaster, the system is scaled up quickly to handle the production load. In AWS, this can be done by adding more instances to the load balancer and by resizing the small capacity servers to run on larger Amazon EC2 instance types. As stated in the preceding section, horizontal scaling is preferred over vertical scaling.

Preparation phase

The following figure shows the preparation phase for a warm standby solution, in which an on-site solution and an AWS solution run side-by-side.

Figure 6: The Preparation Phase of the Warm Standby Scenario.


Multi-Site Solution Deployed on AWS and On-Site

A multi-site solution runs in AWS as well as on your existing on-site infrastructure, in an active-active configuration. The data replication method that you employ will be determined by the recovery point that you choose. For more information about recovery point options, see the Recovery Time Objective and Recovery Point Objective section in this whitepaper.

In addition to recovery point options, there are various replication methods, such as synchronous and asynchronous methods. For more information, see the Replication of Data section in this whitepaper.

You can use a DNS service that supports weighted routing, such as Amazon Route 53, to route production traffic to different sites that deliver the same application or service. A proportion of traffic will go to your infrastructure in AWS, and the remainder will go to your on-site infrastructure.

In an on-site disaster situation, you can adjust the DNS weighting and send all traffic to the AWS servers. The capacity of the AWS service can be rapidly increased to handle the full production load. You can use Amazon EC2 Auto Scaling to automate this process. You might need some application logic to detect the failure of the primary database services and cut over to the parallel database services running in AWS.

The cost of this scenario is determined by how much production traffic is handled by AWS during normal operation. In the recovery phase, you pay only for what you use for the duration that the DR environment is required at full scale. You can further reduce cost by purchasing Amazon EC2 Reserved Instances for your “always on” AWS servers.

Preparation phase

The following figure shows how you can use the weighted routing policy of the Amazon Route 53 DNS to route a portion of your traffic to the AWS site. The application on AWS might access data sources in the on-site production system. Data is replicated or mirrored to the AWS infrastructure.

Figure 8: The Preparation Phase of the Multi-Site Scenario.
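
With weighted routing, each site receives a share of traffic equal to its weight divided by the sum of all weights, so failing over amounts to setting the on-site weight to zero. The sketch below models only that arithmetic; the site names and weights are assumptions, and no Route 53 API is called.

```python
def traffic_shares(weights: dict) -> dict:
    """Fraction of requests each site receives under weighted routing."""
    total = sum(weights.values())
    if total <= 0:
        # This sketch does not model the case where every weight is zero.
        raise ValueError("at least one site needs a positive weight")
    return {site: weight / total for site, weight in weights.items()}


def fail_over(weights: dict, failed_site: str) -> dict:
    """Return new weights with the failed site set to zero."""
    if failed_site not in weights:
        raise KeyError(failed_site)
    return {site: (0 if site == failed_site else weight)
            for site, weight in weights.items()}


# Assumed normal operation: 90% on-site, 10% in AWS.
normal = {"on-site": 90, "aws": 10}
print(traffic_shares(normal))

# On-site disaster: all traffic goes to the AWS servers.
print(traffic_shares(fail_over(normal, "on-site")))
```

Because DNS answers are cached by resolvers, the shift takes effect gradually according to the record TTL, which is worth factoring into the RTO.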

http://media.amazonwebservices.com/AWS_Disaster_Recovery.pdf

Page 65: Journey Through The Cloud - Disaster Recovery

aws.amazon.com/vpc

aws.amazon.com/directconnect

aws.amazon.com/s3

aws.amazon.com/glacier

aws.amazon.com/storagegateway

Page 66: Journey Through The Cloud - Disaster Recovery

AWS re:Invent 2015 | (STG304) Deploying a Disaster Recovery Site on AWS

https://www.youtube.com/watch?v=bXrGUlgbl-s&list=PLhr1KZpdzukdTMmq1gkXs7g6WIIXtL5r9&index=15

Page 67: Journey Through The Cloud - Disaster Recovery

aws.amazon.com/architecture/

Page 68: Journey Through The Cloud - Disaster Recovery

AWS Training & Certification

Certification: Validate your proven skills and expertise with the AWS platform

aws.amazon.com/certification

Self-Paced Labs: Try products, gain new skills, and get hands-on practice working with AWS technologies

aws.amazon.com/training/self-paced-labs

Training: Build technical expertise to design and operate scalable, efficient applications on AWS

aws.amazon.com/training

Page 69: Journey Through The Cloud - Disaster Recovery

Follow us for more events & webinars

@AWScloud for Global AWS News & Announcements

@AWS_UKI for local AWS events & news

@IanMmmm

Ian Massingham, Technical Evangelist