Protect your app from Outages

Ron Zavner, Applications Architect at GigaSpaces, February 2013

Upload: ron-zavner

Post on 12-Jun-2015


DESCRIPTION

How to protect your application from outages and failures of cloud infrastructure: planning a disaster recovery architecture and using Cloudify for cloud abstraction and monitoring.

TRANSCRIPT

Page 1: Protect your app from Outages

Protect your app from Outages

Ron Zavner, Applications Architect at GigaSpaces

February 2013

Page 2: Protect your app from Outages

AGENDA

• AWS and outages
• Outage impact
• Disaster Recovery – it’s all about redundancy!
• Cloudify as a solution for redundancy
• Demo with Cloudify on EC2

® Copyright 2013 GigaSpaces Ltd. All Rights Reserved

Page 3: Protect your app from Outages

AWS USAGE

Managing Big Data on the Cloud

• AWS – around 0.5M servers
• Facebook – less than 0.1M servers
• Google – around 1M servers

Page 4: Protect your app from Outages


THE OUTAGE PROBLEM

Page 5: Protect your app from Outages


OUTAGE – APRIL 21, 2011

Page 6: Protect your app from Outages


OUTAGE - JUNE 29, 2012

Page 7: Protect your app from Outages


OUTAGE - OCTOBER 22, 2012

Page 8: Protect your app from Outages


OUTAGE - CHRISTMAS EVE 2012

Page 9: Protect your app from Outages

THAT’S WHAT YOU EXPECT?

• 99% - 3.65 days of downtime per year
• 99.9% - 8.76 hours of downtime per year
• 99.99% - 53 minutes of downtime per year
• 99.999% - 5.26 minutes of downtime per year
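The downtime figures follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative, not from the deck):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the hours of downtime per year allowed at a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why the jump from 99.9% to 99.99% turns hours into minutes.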

Page 10: Protect your app from Outages

OUTAGE IMPACT – DESIGN FOR FAILURES

An outage could cost…

• $89K per hour for Amadeus
• $225K per hour for PayPal!
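To put the per-hour figures in perspective, they can be multiplied by the downtime a given availability level allows. This is a rough sketch only: it assumes cost scales linearly with downtime and a 365-day year, neither of which the slides state, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24

# Per-hour outage costs quoted on the slide, in US dollars.
COST_PER_HOUR = {"Amadeus": 89_000, "PayPal": 225_000}


def annual_outage_cost(cost_per_hour: float, availability_percent: float) -> float:
    """Estimate yearly outage cost, assuming cost grows linearly with downtime."""
    downtime_hours = (1 - availability_percent / 100) * HOURS_PER_YEAR
    return cost_per_hour * downtime_hours


for company, cost in COST_PER_HOUR.items():
    estimate = annual_outage_cost(cost, 99.9)
    print(f"{company}: about ${estimate:,.0f} per year at 99.9% availability")
```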

Page 11: Protect your app from Outages


DISASTER RECOVERY

Page 12: Protect your app from Outages


MULTI CLOUD


Page 13: Protect your app from Outages

PREPARE FOR DISASTER RECOVERY

• Dedicate an expert to the DR architecture
• Define your target recovery time and recovery point (RTO and RPO)
• Assume every tier can fail
• Use monitoring and alerts
• Document your operational processes
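Recovery time and recovery point targets are easiest to enforce when they are written down as numbers and checked after every DR drill. A minimal sketch, where the class, function, and sample values are all hypothetical:

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class RecoveryTargets:
    rto: timedelta  # recovery time objective: longest acceptable time to restore service
    rpo: timedelta  # recovery point objective: longest acceptable window of lost data


def check_drill(targets: RecoveryTargets,
                time_to_restore: timedelta,
                data_loss_window: timedelta) -> List[str]:
    """Return a list of violated targets for one DR drill (empty means it passed)."""
    violations = []
    if time_to_restore > targets.rto:
        violations.append(f"RTO missed: restore took {time_to_restore}, target {targets.rto}")
    if data_loss_window > targets.rpo:
        violations.append(f"RPO missed: lost {data_loss_window} of data, target {targets.rpo}")
    return violations


# Example values are made up for illustration.
targets = RecoveryTargets(rto=timedelta(minutes=30), rpo=timedelta(minutes=5))
result = check_drill(targets, timedelta(minutes=45), timedelta(minutes=2))
print(result)
```

Feeding the returned violations into the monitoring and alerting channel ties the second and fourth bullets together.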

Page 14: Protect your app from Outages


CHAOS MONKEY


Page 15: Protect your app from Outages


It’s all about REDUNDANCY!

Page 16: Protect your app from Outages

CLONE YOUR ENVIRONMENT

Page 17: Protect your app from Outages

CLONE YOUR DATA

• RDS Read Replica
• More to come…

Page 18: Protect your app from Outages


You must use an AUTOMATION layer

Page 19: Protect your app from Outages

CLOUDIFY POSITIONING IN THE CLOUD STACK

[Diagram: the cloud stack plotted by Productivity versus Control. Labels shown: PaaS (CloudFoundry, Heroku, GAE, OpenShift); DevOps (Automation) (Chef, Puppet); IaaS (public clouds such as AWS and Rackspace; private clouds such as VMware and OpenStack); RightScale; Enstratus. Tagline: "High productivity with full control".]

Page 20: Protect your app from Outages

CLONE YOUR ENV - HOW DOES IT WORK?

Page 21: Protect your app from Outages


EXTENSIVE PLATFORM SUPPORT

Page 22: Protect your app from Outages


USE ANY CLOUD


Page 23: Protect your app from Outages


GETTING COMPUTE RESOURCES IN A PORTABLE WAY

// Service recipe: request compute resources by template name only
compute {
    template "SMALL_LINUX"
}

// EC2 cloud configuration: what SMALL_LINUX means on EC2
SMALL_LINUX : template{
    imageId "us-east-1/ami-76f0061f"
    remoteDirectory "/home/ec2-user/gs-files"
    machineMemoryMB 1600
    hardwareId "m1.small"
    locationId "us-east-1"
    localDirectory "upload"
    keyFile "myKeyFile.pem"
    options ([
        "securityGroups" : ["default"] as String[],
        "keyPair" : "myKeyFile"
    ])
    overrides ([
        "jclouds.ec2.ami-query" : "",
        "jclouds.ec2.cc-ami-query" : ""
    ])
    privileged true
}

// OpenStack cloud configuration: same template name, different values
SMALL_LINUX : template{
    imageId "1234"
    machineMemoryMB 3200
    hardwareId "103"
    remoteDirectory "/root/gs-files"
    localDirectory "upload"
    keyFile "gigaPGHP.pem"
    options ([
        "openstack.securityGroup" : "default",
        "openstack.keyPair" : "gigaPGHP"
    ])
    privileged true
}
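The portability idea here is that the recipe names a template and each cloud configuration decides what that name means. The sketch below illustrates that lookup in plain Python; it is not Cloudify code, the dictionary and function are hypothetical stand-ins, and the values are copied from the two templates on the slide.

```python
# Hypothetical in-memory stand-in for two cloud configurations.
CLOUD_TEMPLATES = {
    "ec2": {
        "SMALL_LINUX": {
            "imageId": "us-east-1/ami-76f0061f",
            "hardwareId": "m1.small",
            "machineMemoryMB": 1600,
        },
    },
    "openstack": {
        "SMALL_LINUX": {
            "imageId": "1234",
            "hardwareId": "103",
            "machineMemoryMB": 3200,
        },
    },
}


def resolve_template(cloud: str, template_name: str) -> dict:
    """Look up the cloud-specific settings behind a portable template name."""
    try:
        return CLOUD_TEMPLATES[cloud][template_name]
    except KeyError:
        raise KeyError(
            f"no template {template_name!r} defined for cloud {cloud!r}"
        ) from None


# The same recipe-level name resolves to different machines per cloud.
for cloud in ("ec2", "openstack"):
    print(cloud, resolve_template(cloud, "SMALL_LINUX"))
```

Because the recipe only ever refers to SMALL_LINUX, moving the application to another cloud means supplying a new template definition rather than editing the recipe.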

Page 24: Protect your app from Outages

DATA REPLICATION

• Cloudify Replicated MySQL Recipe
• Generic replication service using WAN Gateway

Page 25: Protect your app from Outages

GENERIC REPLICATION SERVICE OVER WAN

[Diagram: three sites, Hong Kong, London, and New York, connected over the WAN.]

• In-Memory Speed
• High Availability and Self-Healing
• Scalable and Efficient

Page 26: Protect your app from Outages


Real Life Scenario

Page 27: Protect your app from Outages

VERIFI (CURRENT) DEPLOYMENT ARCHITECTURE

[Diagram: a single availability region (US-West: Oregon). Internet traffic reaches four EC2 instances running mod_cluster, JBoss, PostgreSQL, and Cassandra, with two data volumes attached. The deployment uses 4 recipes.]

Page 28: Protect your app from Outages

TARGET ARCHITECTURE

[Diagram: two availability regions, US-West Oregon and US-East Virginia. Each region has EC2 instances for mod_cluster, JBoss, Postgres, and Cassandra, plus data volumes. Oregon receives Internet traffic and hosts the Postgres master; Virginia hosts the Postgres slave. The two regions are linked by replication.]

Bootstrap two EC2 clouds in different regions and install the “verifi” application on each. The second cloud uses a slightly modified (extended) Postgres recipe so that it acts as a slave, and it runs no app servers. When the primary region fails, the second cloud spins up app server instances and turns its data instance into the master, then bootstraps another “slave” cloud in another region.

Page 29: Protect your app from Outages

FAILOVER SCENARIO

[Diagram: before the failure, cloud #1 in US-West Oregon runs the app servers and PostgreSQL, while cloud #2 in US-East Virginia runs PostgreSQL only, with a liveness poll between them. After the failure, cloud #1 is marked down (X), cloud #2 in US-East Virginia runs the app servers and PostgreSQL, and cloud #3 in US-West California runs PostgreSQL, again with a liveness poll.]

Step 0: Upon initial deployment, the primary deployment of the application is bootstrapped onto cloud #1. A slightly modified application recipe is bootstrapped as cloud #2, which polls cloud #1 for failure and acts as a PostgreSQL DB slave.

Steps 1 to 3:

• Region failure occurs
• Turn the Postgres slave into master and start app server instances*
• Bootstrap another cloud in a different region using the same application recipe used to bootstrap cloud #2 above*
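The failover sequence can be sketched as a small standby loop: keep polling the primary, and once enough consecutive polls fail, run the recovery actions in order. This is an illustration only, not Cloudify code. The function names are hypothetical, and the threshold of three missed polls is an assumption, since the slides do not say how failure is declared.

```python
from typing import Callable, Iterable, List


def standby_loop(polls: Iterable[bool],
                 on_failover: Callable[[], None],
                 max_missed: int = 3) -> bool:
    """Consume liveness poll results; fail over after max_missed consecutive misses."""
    missed = 0
    for alive in polls:
        # A successful poll resets the counter, so brief blips do not trigger failover.
        missed = 0 if alive else missed + 1
        if missed >= max_missed:
            on_failover()
            return True
    return False


def make_failover(log: List[str]) -> Callable[[], None]:
    """Build a failover callback that records the recovery actions in order."""
    def failover() -> None:
        log.append("promote Postgres slave to master")
        log.append("start app server instances")
        log.append("bootstrap a new slave cloud in another region")
    return failover


# Simulated poll results: the primary answers twice, then goes silent.
actions: List[str] = []
failed_over = standby_loop([True, True, False, False, False], make_failover(actions))
print(failed_over, actions)
```

Requiring several consecutive misses is a common way to avoid failing over on a single dropped poll; the right threshold depends on the poll interval and the recovery time target.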

Page 30: Protect your app from Outages

DEMO ON EC2 - 5-MINUTE SETUP

/* Credentials - You must enter your
 * cloud provider account credentials */
user="ENTER_USER_HERE"
apiKey="ENTER_API_KEY_HERE"
keyFile="ENTER_KEY_FILE_HERE"
keyPair="ENTER_KEY_PAIR_HERE"

// Advanced usage
hardwareId="m1.small"
locationId="us-east-1"
linuxImageId="us-east-1/ami-1624987f"
ubuntuImageId="us-east-1/ami-82fa58eb"
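A common slip in a quick setup like this is bootstrapping with a placeholder still in place. The sketch below scans properties text for values that still follow the ENTER_..._HERE pattern shown above. It is a hypothetical helper, not part of Cloudify, and the sample text and key value are made up for illustration.

```python
import re

# Matches lines such as: user="ENTER_USER_HERE"
PLACEHOLDER = re.compile(r'^\s*(\w+)\s*=\s*"ENTER_\w+_HERE"\s*$', re.MULTILINE)


def unfilled_properties(text: str) -> list:
    """Return the names of properties whose value is still a placeholder."""
    return PLACEHOLDER.findall(text)


sample = '''user="ENTER_USER_HERE"
apiKey="my-real-key"
keyFile="ENTER_KEY_FILE_HERE"
hardwareId="m1.small"
'''

missing = unfilled_properties(sample)
print(missing)
```

Running a check like this before bootstrapping lets the setup fail immediately with a clear message instead of failing later at the cloud provider's API.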

Page 31: Protect your app from Outages

SUMMARY

• AWS and outages
• Outage impact
• Disaster Recovery – it’s all about redundancy!
  • Cloning your environment – app stack
  • Cloning your DB – Replication
• Cloudify as a solution for Redundancy
  • Use recipes to work on any cloud
  • Fast and customized data replication
• Demo with Cloudify on EC2

Page 32: Protect your app from Outages

QUESTIONS & ANSWERS

Thank you!

[email protected]