
DISASTER RECOVERY PLANNING

TABLE OF CONTENTS

EXECUTIVE SUMMARY
CHAPTER 01: WHAT IS A DISASTER RECOVERY PLAN (DRP)?
CHAPTER 02: WHY SHOULD MY COMPANY HAVE ONE?
CHAPTER 03: WHAT DISASTERS SHOULD WE PREPARE FOR?
CHAPTER 04: HOW TO CREATE AN EFFECTIVE DRP
CHAPTER 05: COMMON DRP MISTAKES
CHAPTER 06: NEED HELP PROTECTING YOUR BUSINESS?

EXECUTIVE SUMMARY

This comprehensive guide will help you understand what a disaster recovery plan is and how to effectively implement one for your business. It was designed to serve as a primer in disaster management and preparation.

DISASTERS CAN BE DEVASTATING IF NOT PREPARED FOR

Whether man-made or naturally occurring, disasters can be a real threat to a company’s survival. Some common disasters include:

» Catastrophic security compromise
» Fire
» Flood
» Earthquake
» Power failure

In many cases the negative effects of disasters can be prevented or greatly reduced if your company is ready to act and the proper steps are promptly taken.

Careful disaster recovery planning can mitigate damage and reduce the risks of major data and profit loss.

EVERY EFFECTIVE DISASTER RECOVERY PLAN SHOULD FOLLOW THESE 6 SPECIFIC STEPS

Although every disaster recovery plan will be different, all effective disaster recovery planners should follow these key steps:

» Delegate responsibilities
» Perform risk assessment
» List your recovery objectives
» Formulate your plan
» Test your plan
» Implement your plan

ABOUT MACQUARIE TELECOM

Macquarie Telecom provides dynamic hosting and communications platforms that companies can truly rely on for the delivery of their corporate communications and applications. Combining business-grade full line telecommunications (voice, data and mobile) with Australian owned and located hosting services, Macquarie Telecom is not just a single solution; it is an end-to-end communications platform that enables customers to make smarter decisions on how to run and build their businesses.

CHAPTER 1

WHAT IS A DISASTER RECOVERY PLAN?

In the event of a man-made or natural disaster, it is critical that your company be properly prepared. A disaster recovery plan (DRP) provides clear instructions for the recovery and protection of your IT infrastructure in disaster scenarios.

A DRP can take the form of written or verbal instructions, but, in order to maximise effectiveness, it is usually explained to staff through training sessions and distributed in written form.

WHAT’S THE DIFFERENCE BETWEEN A DRP AND A BUSINESS CONTINUITY PLAN?

Although sometimes confused, the disaster recovery plan and business continuity plan (BCP) are separate concepts. The DRP is a subset of the more general BCP, which contains five components:

» Business Resumption Plan
» Occupant Emergency Plan
» Continuity of Operations Plan
» Incident Management Plan
» Disaster Recovery Plan

The disaster recovery plan provides a guide for returning IT infrastructure to normalcy following a disaster, whereas the other elements of the BCP deal with non-IT issues.

CHAPTER 2

WHY SHOULD MY COMPANY HAVE ONE?

Every second of IT downtime can have devastating effects on customer experience, revenue and your company’s image. An effective disaster recovery plan can greatly reduce the costs associated with a disaster event.

By defining clear guidelines before disaster strikes, you give your IT team the tools they need to react and recover faster.

DISASTERS CAN BE COSTLY

The financial ramifications of even short periods of downtime can be staggering. Costs can take the form of lost revenue, lost productivity, and the expense of returning to normalcy.

Companies that rely on e-commerce, telecommunications or other IT services for their revenue stream can be particularly affected by downtime, with losses averaging $5,600 per minute across companies and reaching up to $11,000 per minute.[1]

Downtime is also remarkably prevalent, even in the largest companies. 59% of Fortune 500 companies experience an average of 1.6 hours of downtime every week, which translates to an average yearly downtime cost of $2.79 million.[2]
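As a rough illustration of the arithmetic behind these figures, the sketch below multiplies a per-minute cost by the length of an outage and annualises a weekly downtime figure. The function names are hypothetical, the 52-week year is an assumption, and the rates are simply the figures cited above; your own costs will differ.

```python
def outage_cost(minutes_down: float, cost_per_minute: float) -> float:
    """Estimated cost of one outage: duration multiplied by per-minute cost."""
    if minutes_down < 0 or cost_per_minute < 0:
        raise ValueError("duration and cost must be non-negative")
    return minutes_down * cost_per_minute


def yearly_downtime_hours(hours_per_week: float, weeks_per_year: int = 52) -> float:
    """Annualise a weekly downtime figure (assumes 52 weeks in a year)."""
    return hours_per_week * weeks_per_year


if __name__ == "__main__":
    # A one-hour outage at the cited average and maximum per-minute rates.
    print(outage_cost(60, 5_600))
    print(outage_cost(60, 11_000))
    # 1.6 hours of downtime per week, expressed as hours per year.
    print(round(yearly_downtime_hours(1.6), 1))
```

At the cited average rate, a single one-hour outage works out to $336,000, which is why even small reductions in downtime are worth planning for.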

DISASTERS CAN NEGATIVELY AFFECT YOUR COMPANY’S IMAGE

Although much more difficult to quantify than financial costs, the effects of disaster-related downtime on a company’s reputation can be significant. Half of all companies surveyed reported that downtime has a negative effect on a company’s image, and 35% reported that they believe downtime would negatively affect their customers’ loyalty.[3]

When customers cannot access your website or applications, or cannot get proper assistance, goodwill can be significantly diminished.

DISASTER RECOVERY PLANS REDUCE THE NEGATIVE EFFECTS OF DISASTER DOWNTIME

DRPs can help a company deal with disaster in two ways.

THEY REDUCE THE LENGTH OF DOWNTIME

DRPs can allow your company to return to normalcy much more quickly. This can be invaluable when downtime costs can be as high as $11,000 per minute.

THEY MAKE DISASTERS LESS COSTLY

Aside from reducing the length of downtime, disaster recovery planning can make the effects of downtime less destructive.

If your company has a contingency plan to recover lost data and communicate with customers, even in periods of disaster, losses will be minimised.

CHAPTER 3

WHAT DISASTERS SHOULD WE PREPARE FOR?

Disasters are generally defined as any man-made or natural event that causes substantial destruction. As they relate to the protection of your data and IT infrastructure, the most common disasters you should be prepared for are:

NATURAL DISASTERS

FIRE

Fires can cause loss of telecommunications infrastructure, electricity, structures and personnel. This is one of the most devastating and common disasters.

FLOOD

Floods can be caused either by large natural phenomena such as rainstorms, or by more modest man-made sources like a water leak. If your company or IT infrastructure is located in a flood-prone area, floods can be extremely destructive.

EARTHQUAKE

Major earthquakes can strike without warning and are one of the most difficult disasters to prepare for.

TROPICAL CYCLONE

These are seasonal disasters that can severely damage coastal IT infrastructure.

MAN-MADE DISASTERS

POWER FAILURE

Power failure, especially for extended periods of time, can be very difficult to manage.

CATASTROPHIC SECURITY COMPROMISE

Although not always categorised as a disaster, IT security compromises, such as hacking or deliberate actions by a rogue employee, can nonetheless be incredibly destructive and result in a loss of data or customer trust. Your team should be prepared for all malicious attacks.

RIOTING

Riots caused by civil unrest or other disasters can be difficult to predict or control. Make sure your IT infrastructure is always properly secured.

This is only a cursory list of the most common disasters affecting IT infrastructure. In order to be fully prepared, make your own list of potential disasters.

CHAPTER 4

HOW TO CREATE AN EFFECTIVE DRP

Every disaster recovery plan will be unique, as each must be customised to fit the company’s risks and needs. However, there are certain steps that must always be followed in order to create an effective DRP.

DELEGATE RESPONSIBILITY

This is an often overlooked step, but one of the most important. At the outset of the planning process, it is critical that a leader be selected and that responsibilities be clearly delegated before disaster strikes. This increases accountability and efficiency during times of crisis.

GET UPPER MANAGEMENT SUPPORT

If management is not committed to creating and following through on a DRP, the plan will not succeed. The leaders of your company must be fully responsible for the success, or failure, of the plan so that it can attain the necessary financial and human resources.[4]

FORM A COMMITTEE TO OVERSEE DRP CREATION

Once management is fully committed, the next step is to form a committee to create and approve the plan. This committee will likely be composed of technical experts within the company, who will be responsible for developing the content of the plan, and management, who will approve and oversee the plan’s implementation.


PERFORM A RISK ANALYSIS

In order to be properly prepared for disaster, it is important to first identify which disasters will have the most impact on your business and which are most likely to occur.

The risk analysis process identifies the likelihood of a disaster occurring and analyses the possible results should that disaster occur. This gives your company an idea of which disasters pose the greatest threats and allows you to properly prioritise your resources.

ORDER THREATS BY THEIR RISK SCORE

In order to create an objectively prioritised list of the greatest threats to your organisation, it is necessary for the planning committee to quantify each possible disaster.

Cisco Systems disaster recovery experts recommend you start by assessing the probability that each event will occur on a scale from 1 to 10, then assessing the potential impact on your business and time to return to normalcy using the same scale. Add the scores together to get the risk score for each threat.[5]

Below is an example risk assessment table with the typical scores associated with each threat. Your business’s own scores may vary depending on location and industry.

Business Risk Component    Probability    Impact         Total Risk Score
Human Error(s)             (5) Medium     (10) High      (15) M / H
Software Bug               (3) Low        (10) High      (13) L / H
Hardware Failure           (8) High       (10) High      (18) H / H
Security Breach            (3) Low        (6) Medium     (9) L / M
Fibre Cut                  (10) High      (10) High      (20) H / H
Natural Disaster           (2) Low        (10) High      (12) L / H
Civil Unrest               (2) Low        (5) Medium     (7) L / M
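The scoring in the table can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Threat` type and function names are hypothetical, and because the table sums only probability and impact, the sketch does the same.

```python
from typing import NamedTuple


class Threat(NamedTuple):
    name: str
    probability: int  # 1 (unlikely) to 10 (almost certain)
    impact: int       # 1 (minor) to 10 (severe)


def risk_score(threat: Threat) -> int:
    """Risk score as used in the table: probability plus impact."""
    return threat.probability + threat.impact


def rank_threats(threats: list[Threat]) -> list[tuple[str, int]]:
    """Return (name, score) pairs, highest risk first."""
    scored = [(t.name, risk_score(t)) for t in threats]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Example scores copied from the table above.
THREATS = [
    Threat("Human Error(s)", 5, 10),
    Threat("Software Bug", 3, 10),
    Threat("Hardware Failure", 8, 10),
    Threat("Security Breach", 3, 6),
    Threat("Fibre Cut", 10, 10),
    Threat("Natural Disaster", 2, 10),
    Threat("Civil Unrest", 2, 5),
]

if __name__ == "__main__":
    for name, score in rank_threats(THREATS):
        print(f"{score:>2}  {name}")
```

Sorting by score puts the fibre cut and hardware failure at the top of the list, which is where the committee would focus its resources first.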


DETERMINE POSSIBLE OUTCOMES OF EACH RISK

Once the risks have been assessed and ordered, the committee must begin the process of listing all the possible outcomes should any of the major threats occur. This list will be your template when deciding what issues need to be addressed in your plan.

LIST YOUR OBJECTIVES

After preparing the risk assessment, it’s time to start listing your recovery objectives. This will enumerate the goals of the DRP and allow you to create more effective recovery plans.

PRIORITISE YOUR RECOVERY

Determine which business applications and systems are the most valuable to your business and which can be offline for longer periods of time. Outline which services and functions of the business need continuity, and which do not. This provides the information necessary to properly respond and reduce the impact of a disaster.

DETERMINE YOUR RECOVERY POINT OBJECTIVE (RPO)

The recovery point objective is the farthest point in the past to which you are prepared to roll back, that is, the maximum age of the data you recover after a disaster. For example, if the RPO is one hour, data must be backed up to a separate secure location at least every hour in order to meet the objective. Your company’s choice of RPO will likely depend on the value of the data stored and the rate at which your company generates data.
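To make the one-hour example concrete, the following sketch checks a backup history against an RPO. It is an illustration under stated assumptions: the function name is hypothetical, and it assumes you can obtain the completion time of each backup.

```python
from datetime import datetime, timedelta


def meets_rpo(backup_times: list[datetime], now: datetime, rpo: timedelta) -> bool:
    """True if the newest backup was never older than `rpo`.

    Checks every gap between consecutive backups, plus the gap between
    the last backup and `now`.
    """
    if not backup_times:
        return False  # no backups at all can never satisfy an RPO
    checkpoints = sorted(backup_times) + [now]
    gaps = (later - earlier for earlier, later in zip(checkpoints, checkpoints[1:]))
    return all(gap <= rpo for gap in gaps)


if __name__ == "__main__":
    day = datetime(2014, 3, 1)
    hourly = [day.replace(hour=h) for h in (9, 10, 11)]
    # Hourly backups, checked half an hour after the last one.
    print(meets_rpo(hourly, day.replace(hour=11, minute=30), timedelta(hours=1)))
    # A missed 10:00 backup leaves a two-hour gap.
    print(meets_rpo([hourly[0], hourly[2]], day.replace(hour=11, minute=30), timedelta(hours=1)))
```

The first check passes and the second fails, because a single missed backup is enough to put two hours of data at risk.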

DETERMINE YOUR RECOVERY TIME OBJECTIVE (RTO)

Commonly confused with the recovery point objective, the recovery time objective is the maximum amount of time an IT system or application can be offline. For example, a recovery time objective of four hours indicates that systems must be back online within four hours. Your recovery time objective might vary depending on the severity and type of disaster, and it need not be the time you realistically expect to achieve, but rather the optimal time in which normal operations should resume.

FORMULATE A DRP

Once the prep work is done, the committee can start creating the DRP itself. This should take the form of a written document and be made available to all relevant parties once it is completed. Additional verbal training is also recommended.


CREATE A DETECTION PLAN

Prior to the actual recovery process, it is necessary to first determine that a disaster has actually occurred. It is important that the DRP not be initiated until a full assessment of the damage has taken place, so as to limit false alarms and unnecessary disruptions to work activity.
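One way automated monitoring can support this step is to escalate only after several consecutive failed checks, so that a single blip does not trigger the plan. The sketch below is an assumption about how such a rule might look, not part of the original guidance: the function name and the default of three failures are hypothetical, and a human damage assessment would still follow any escalation.

```python
def should_escalate(check_results: list[bool], required_failures: int = 3) -> bool:
    """Escalate for assessment only if the latest checks all failed.

    `check_results` is ordered oldest to newest; True means the check passed.
    `required_failures` is a hypothetical threshold; tune it to your systems.
    """
    if required_failures < 1:
        raise ValueError("required_failures must be at least 1")
    if len(check_results) < required_failures:
        return False  # not enough evidence yet
    recent = check_results[-required_failures:]
    return not any(recent)


if __name__ == "__main__":
    print(should_escalate([True, False, False, False]))  # three failures in a row
    print(should_escalate([False, False, True]))         # service has recovered
```

Requiring consecutive failures trades a short delay in detection for fewer false alarms, which is the balance this step asks the committee to strike.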

DEVELOP THE RECOVERY PROCEDURE

This is the most important step in your DRP as it will determine how your team responds after a disaster has been declared. The disaster planning committee must make several key decisions to ensure a quick return to normalcy.

» Make a first response directive. The extent of the damage is often determined in the early stages of a disaster. For this reason it is critical that the DRP include first response instructions for likely disaster scenarios. This might include shutting off utilities, assessing damage, preparing backup power, and/or powering off equipment.

» Plan backup sites. The committee must choose backup sites where computing, and therefore work, can continue with minimal interruption. There are several options. Hot sites are a near replica of the original working site, with real-time backups of data and fully equipped hardware ready to be used immediately. Cold sites are simply separate spaces to which work operations can be moved. They do not contain backup hardware or data, so operations may take some time to resume. Because of the cost associated with hot sites and the slow recovery time of cold sites, many companies choose to operate warm sites, which are smaller scale versions of the original work site, with data backups that may be hours to days old and backup equipment that is not as extensive as that at the original site.

» Source replacement hardware. In the event that necessary hardware is damaged or destroyed, it is important to have a reliable, up-to-date source for replacements. Make a list of all mission critical devices along with a reliable replacement source. In some cases your company may find it necessary to keep backup hardware on hand or to leverage an Infrastructure as a Service (IaaS) platform to speed the recovery of certain applications.

» Source backup personnel. During some disasters, personnel may be unable to work. In these cases it can be necessary to call in additional help. In order to expedite this process, your recovery plan should include a source of off-site human resources.

» Make your plan responsive. The best plans recognise that it is impossible to foresee every situation. When drafting your DRP, include instructions that are adaptable to a wide variety of scenarios. This will allow your recovery team to quickly get operations up and running again, no matter what happens.
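The trade-off between hot, warm and cold sites described in the list above can be framed as picking the cheapest site that still meets your RTO and RPO. The sketch below is only an illustration: the `BackupSite` type is hypothetical, and every time and cost figure is a made-up placeholder to be replaced with your own estimates.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class BackupSite:
    kind: str                  # "hot", "warm" or "cold"
    time_to_resume: timedelta  # how long before work can continue at the site
    data_age: timedelta        # how stale the data available at the site may be
    yearly_cost: float


def cheapest_adequate_site(sites: Iterable[BackupSite], rto: timedelta,
                           rpo: timedelta) -> Optional[BackupSite]:
    """Cheapest site that resumes within the RTO with data no older than the RPO."""
    adequate = [s for s in sites if s.time_to_resume <= rto and s.data_age <= rpo]
    return min(adequate, key=lambda s: s.yearly_cost, default=None)


# Placeholder figures for illustration only; they are not from any real quote.
SITES = [
    BackupSite("hot", timedelta(minutes=15), timedelta(0), 500_000.0),
    BackupSite("warm", timedelta(hours=8), timedelta(hours=24), 150_000.0),
    BackupSite("cold", timedelta(days=5), timedelta(days=7), 40_000.0),
]

if __name__ == "__main__":
    choice = cheapest_adequate_site(SITES, rto=timedelta(hours=12), rpo=timedelta(hours=24))
    print(choice.kind if choice else "no site meets the objectives")
```

With these placeholder numbers a 12-hour RTO and 24-hour RPO select the warm site, while tightening the objectives to four hours and one hour leaves only the hot site.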

PLAN FOR RECONSTRUCTION

After the disaster has passed, your team will need instructions on how to return to normalcy. This might include work site inspections for structural damage, purchasing new equipment, installing new hardware, and systems testing. It should also include guidelines for how and when staff should return to work.

COMPILE THE DRP DOCUMENT

After the disaster recovery plan has been carefully formulated, it must be formatted into a clear, concise document. The instructions should be simple and easily followed, but detailed enough to cover any potential issues that might arise. Creating a document that is effective in times of an actual emergency can be challenging, so significant effort should be made to ensure that it is well made.

TEST THE PLAN

This step will reveal any flaws in the DRP and offer insights into how it can be improved. The plan should first be carefully reviewed by the DRP committee and checked for obvious errors. After this initial evaluation has been completed, a dry run should be initiated in which testers simulate potential disasters.

Plans should be judged by how well they meet their RTO and RPO goals in simulations. If the results of testing are unsatisfactory, it may be necessary to significantly revise the DRP.
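Judging a dry run against the RTO and RPO goals can be as simple as comparing measured times with the objectives. The sketch below assumes you record, for each simulated scenario, how long recovery took and how old the restored data was; the `DryRunResult` type and the sample results are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DryRunResult:
    scenario: str
    time_to_recover: timedelta   # measured outage length in the simulation
    data_loss_window: timedelta  # age of the newest data that was restored


def failed_scenarios(results: list[DryRunResult], rto: timedelta,
                     rpo: timedelta) -> list[str]:
    """Names of simulated scenarios that missed the RTO or the RPO."""
    return [
        r.scenario
        for r in results
        if r.time_to_recover > rto or r.data_loss_window > rpo
    ]


if __name__ == "__main__":
    # Hypothetical dry-run measurements, judged against a 4-hour RTO and 1-hour RPO.
    results = [
        DryRunResult("Fire", timedelta(hours=3), timedelta(minutes=30)),
        DryRunResult("Power failure", timedelta(hours=6), timedelta(minutes=30)),
        DryRunResult("Security compromise", timedelta(hours=2), timedelta(hours=3)),
    ]
    print(failed_scenarios(results, rto=timedelta(hours=4), rpo=timedelta(hours=1)))
```

Any scenario named in the output points to a part of the plan that needs revision before the next round of testing.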


IMPLEMENT THE PLAN

After a DRP has successfully passed the testing phase, it must be approved by management. At this point additional changes informed by cost or resource concerns may be made, and the plan may have to go through more rounds of testing and revision.

If the plan is approved, it should be immediately implemented by management. This includes distributing the written plan to all relevant employees and training them on it. Management should also create a regular review schedule for the DRP so that it can be updated to address any changes that may occur in the future.


CHAPTER 5

COMMON DRP MISTAKES

Disaster recovery planning is a complicated and difficult process involving many people and long hours. As such, it frequently results in an imperfect DRP. We’ve listed some of the most common mistakes to avoid.

NOT TRAINING EMPLOYEES ON THE DRP

After the DRP document has been created and distributed, it is important that the whole team be trained on it. This will make the plan clearer, give employees the chance to practise using it, and allow them to ask any questions they may have.

AIMING TOO LOW WITH RTOS AND RPOS

It is important to remember that recovery point objectives and recovery time objectives are goals, not requirements. As such, they should be set to represent the optimal recovery process, not necessarily the likely one. This will encourage your team to work harder and be more diligent in the planning and recovery process.

NOT CONDUCTING END USER TESTING

Until the end user is able to use an application, testing is not complete. Services may start and appear to be working properly on the back-end but be inoperable on the user’s end. This situation can be among the most damaging if not prepared for, as the IT team will be unaware that there is a problem and unable to respond.
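A simple way to test from the user's side is to request a page the way a browser would and confirm it shows content a real user expects, rather than only checking that a service process is running. The sketch below is a minimal illustration using Python's standard `urllib`; the function names are hypothetical, and the URL and expected text you pass in are placeholders for your own application.

```python
import urllib.request
from typing import Callable, Tuple

# A fetch function takes a URL and returns (HTTP status, page body).
Fetch = Callable[[str], Tuple[int, str]]


def http_fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a page over HTTP and return (status, body)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


def user_can_use(url: str, expected_text: str, fetch: Fetch = http_fetch) -> bool:
    """True only if the page loads and contains text a real user would see."""
    try:
        status, body = fetch(url)
    except OSError:
        # Covers connection failures, timeouts and HTTP error responses.
        return False
    return status == 200 and expected_text in body


if __name__ == "__main__":
    # Stand-in for a service that is "up" on the back-end but shows users an error page.
    def broken(url: str) -> Tuple[int, str]:
        return 200, "<h1>Service unavailable</h1>"

    print(user_can_use("https://shop.example/login", "Sign in", fetch=broken))
```

Passing the fetch function as a parameter keeps the check easy to exercise in a dry run without touching production systems.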

NOT UPDATING THE PLAN

DRPs should be updated at least once a year, and whenever a business application or process is changed. This ensures the plan reflects updated hardware, an evolving business structure, and other changes. Many companies overlook this step only to find a previously effective DRP doesn’t perform under current conditions.
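The update rule above (at least once a year, and after any change to a business application or process) is easy to turn into a reminder check. This is a small sketch with hypothetical names; it treats "a year" as 365 days.

```python
from datetime import date, timedelta

# "At least once a year", approximated here as 365 days.
REVIEW_INTERVAL = timedelta(days=365)


def review_is_due(last_review: date, today: date,
                  changed_since_review: bool = False) -> bool:
    """True if the DRP should be reviewed now.

    A review is due when a year has passed since the last one, or when a
    business application or process has changed in the meantime.
    """
    return changed_since_review or (today - last_review) >= REVIEW_INTERVAL


if __name__ == "__main__":
    print(review_is_due(date(2013, 3, 1), date(2014, 3, 1)))
    print(review_is_due(date(2014, 1, 15), date(2014, 3, 1)))
    print(review_is_due(date(2014, 1, 15), date(2014, 3, 1), changed_since_review=True))
```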

MAKING THE PLAN TOO COMPLICATED

Although plans should be thorough and include all necessary information, they should not be overly long or complicated. Prioritise information and present it in clear, concise steps so your team can react quickly and properly, even in the most stressful situations.

NOT DELEGATING PROPER RESOURCES TO THE DRP

This incredibly common mistake can destroy the chances of disaster recovery success. Many companies believe that the risks of disaster are slim and that disaster recovery planning is a low priority. However, disasters, though rare, can threaten the viability of your company. Only those companies that are properly prepared will suffer minimal losses in the worst disasters. Under-resourcing also includes relying completely on only a handful of individuals who may themselves be affected by the same disaster, for example a bushfire or flood.

STORING THE DRP ON THE NETWORK

This may sound like common sense, but we do know of at least one company that kept its Disaster Recovery Plan on the network and was unable to access the information during a disaster. Think holistically about what elements are required to reinstate a failed IT service, including installation media, servers, storage and installation instructions.

CHAPTER 6

NEED HELP PROTECTING YOUR BUSINESS FROM DISASTER?

Macquarie Telecom’s LAUNCH Disaster Recovery provides completely outsourced disaster recovery solutions at the hypervisor level.

With one of the lowest downtimes of any disaster recovery service, LAUNCH can help your company mitigate losses and get up and running again faster.

WANT TO LEARN MORE ABOUT HOW LAUNCH CAN HELP YOUR COMPANY PREPARE FOR DISASTER?

Contact Macquarie Telecom on 1800 004 943 or visit macquarietelecom.com

REFERENCES:

» [1] Ponemon Institute Study Quantifies Cost of Data Center Downtime. Emerson Network Power. 2011.
» [2] Assessing the Financial Impact of Downtime. Business Computing World. 2011. http://www.businesscomputingworld.co.uk/assessing-the-financial-impact-of-downtime/
» [3] http://www.arcserve.com/us/lpg/~/media/Files/SupportingPieces/ARCserve/avoidable-cost-of-downtime-summary-phase-2.pdf
» [4] http://www.drj.com/new2dr/new2dr/w2_002.htm
» [5] http://www.cisco.com/en/US/technologies/collateral/tk869/tk769/white_paper_c11-453495.html


© March 2014 Macquarie Telecom, All Rights Reserved