

Improve Your IT Disaster Recovery Plan, and Your Ability to Recover From Disaster

4 June 2012 | ID:G00234709

Kevin Knox

Many organizations have inconsistent IT disaster recovery plans that vary in quality, scope and detail. We help disaster recovery and business continuity planners improve their IT disaster recovery plans, and their ability to recover from disaster, by outlining best practices for key problems.

Overview

Explore related content:

"SMB Context: 'Improve Your IT Disaster Recovery Plan, and Your Ability to Recover From Disaster.'" (17 September 2012)

Key Challenges

Minor discrepancies, omissions and oversights in an organization's disaster recovery plan can have a major impact on the time required to recover from a disaster and the associated business impact.

While most organizations claim to have some form of IT disaster recovery plan in place, there are wide-ranging differences in quality, scope and detail level from one plan to another.

Respondents to the 2011 Gartner Risk Management Disciplines Survey were asked which types of disasters their organizations planned for. IT outage was ranked highest among the 13 categories, with 66% of respondents stating that they plan for IT outages.

Recommendations

Organizations should focus their disaster recovery plans specifically on the recovery of IT services, and should clearly define the intended use and scope of the plan as a critical first step.

Two to three senior executives in the organization should be authorized to make a disaster declaration, and only after specific criteria have been met to qualify the event as a disaster.

Organizations should include the details of ongoing recovery operations and failback processes and procedures as highlighted sections in the disaster recovery plan.

Analysis

IT organizations spend considerable time and money developing and managing IT disaster recovery plans they hope will reduce downtime and minimize the business impact when a disaster arises. Although most large organizations claim to have some form of IT disaster recovery plan in place, the numerous plan reviews Gartner performs each year show significant differences in quality, scope and detail level from one plan to another. Disaster recovery plans should be specific enough to address the individual recovery requirements, technologies and processes of an organization. Although no two plans are exactly alike, there are certain issues all organizations should consider and missteps to avoid when developing their plans.

Having a focused, detailed and well-organized disaster recovery plan can mean the difference between smooth recovery operations and chaos during a disaster. This research looks at common mistakes organizations make within their IT disaster recovery plans, and provides recommendations for improvement.

Define the Scope of the Plan

A common mistake organizations make when developing disaster recovery plans is not limiting their scope exclusively to the recovery of IT services. For example, some organizations include general business continuity requirements, which typically fall outside the purview of IT. Despite IT service recovery being a key part of overall business continuity, each department should have its own plan, coordinated at a high level, but managed and owned separately.

Organizations should focus disaster recovery plans specifically on the recovery of IT services, and should clearly define the intended use and scope of the plan as a critical first step. This includes developing a concise statement about what's included and what's not, who the intended audience is and how the document should be used. The scope also should identify the specific locations, businesses, companies and functions covered by the recovery plan.

Note: Business continuity management (BCM) ensures business resilience before, during and after an operational disruption. BCM includes supplier management, crisis management, emergency management, IT disaster recovery management (IT DRM), business recovery, contingency planning and preparedness.

Identify Key Terminology

Most disaster recovery plans reviewed by Gartner fail to include a formal glossary of key terminology and language. Because most recovery plans must address a wide variety of individuals with varying levels of knowledge from multiple internal and external organizations, an advanced understanding of language or terminology cannot be assumed.

A well-defined and easily accessible glossary of key terms and phrases should be included in all disaster recovery plans. Establishing early in the recovery document a common language and terminology (including industry-specific terms, recovery terminology, commonly used acronyms, location and facility names, and abbreviations) helps minimize misinterpretations and potential mistakes.

Make the Plan Easy to Use

Although it may seem a basic point, one constant with good disaster recovery plans is that they are well-organized, easily navigated and easy to use. Organizations often structure their recovery plans as novels instead of reference documents. Disaster recovery plans are rarely read from front to back, and are most likely to be used during a crisis, not as leisure reading beforehand.

To improve effectiveness and ease of use, organizations should separate their disaster recovery plans into multiple, stand-alone sections or subdocuments. For example, a recovery planning section covers items such as methodologies, management and program goals, while a recovery operations section focuses on recovery processes and procedures. Target each section to the specific audience or individual role, and format and organize the plan for the targeted user and by content (see Table 1).


Reference Roles, Not Individuals' Names

Having an accurate and up-to-date recovery plan is critical for success. Unfortunately, it is not uncommon for recovery plans to be out of date. Organizations typically do not update their plans frequently enough to keep pace with personnel changes among the individuals who are assigned recovery responsibilities. This opens the door for tasks to be assigned to people who are no longer in the required role, have left the company or have changed their contact information.

Avoid the use of individuals' names and contact information in the recovery document, and use roles and job titles instead. References to roles and job titles can be indexed against an appendix of individual names and contact information. This way, only the appendix needs to be updated on a regular basis, and the update can be automated via standard HR reports.
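The role-to-appendix indexing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the step list, the role names, the CSV column names (`role`, `name`, `phone`) and the sample HR export are all hypothetical, and a real HR report would have its own layout.

```python
import csv
import io

# Hypothetical recovery steps: each one names a role, never an individual.
RECOVERY_STEPS = [
    ("Approve failover to the recovery site", "IT Director"),
    ("Restore database backups", "Database Administrator"),
]


def load_contact_appendix(hr_report):
    """Build a role -> list-of-contacts appendix from a CSV HR report."""
    appendix = {}
    for row in csv.DictReader(hr_report):
        appendix.setdefault(row["role"], []).append(
            {"name": row["name"], "phone": row["phone"]}
        )
    return appendix


def resolve_steps(steps, appendix):
    """Pair each step with current contacts; an empty list flags an unfilled role."""
    return [(task, role, appendix.get(role, [])) for task, role in steps]


# Hypothetical HR export; in practice it would be regenerated on a schedule.
HR_REPORT = io.StringIO(
    "role,name,phone\n"
    "IT Director,A. Example,555-0100\n"
    "Database Administrator,B. Example,555-0101\n"
)

appendix = load_contact_appendix(HR_REPORT)
resolved = resolve_steps(RECOVERY_STEPS, appendix)
```

Because the steps refer only to roles, a personnel change means regenerating the appendix, and the body of the plan stays untouched. A role that resolves to an empty contact list is a useful signal that the appendix or the plan needs attention.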

Address Ongoing Recovery and Failback, as Well as Failover

Most disaster recovery plans Gartner reviews focus almost exclusively on failover processes and procedures. These plans usually fail to include adequate levels of detail, if any details are addressed at all, on what should happen in operations after a disaster failover occurs, or on re-establishing production operations via failback.

Ongoing recovery operations and failback procedures are almost as important as failover, and should be covered in detail in all disaster recovery plans. Organizations should ensure that disaster postmortem processes are established to understand the root cause of the disaster and how it impacted IT, and to assess recovery performance.

Consider the Types of Disasters to Plan For

What types of disasters should organizations plan for? Two common approaches to answering this question are:

One size fits all, where all types of disaster scenarios are treated the same

Individual subplans to address a wide array of potential disaster scenarios

While there is no right answer, many recovery plans we review are overly general or too comprehensive and complex.

Table 1. Recovery Planning and Recovery Operations: Document Differences

| Item | Recovery Planning | Recovery Operations |
| --- | --- | --- |
| Target | IT leaders | IT operations |
| Formatting | Paragraphs and sections | Bulleted lists |
| Order | Varied | Sequential |
| Writing | Detailed | Straightforward and concise |
| Indexed | Not important | Highly important |
| Knowledge assumption | High | Low |

Source: Gartner (June 2012)

Organizations should plan for disaster scenarios based on their ability to manage and benefit from including the various scenarios. Scenarios based on criteria such as notification time (e.g., a tornado warning is in effect starting tomorrow at 12 noon), type of disaster and potential business impact should be established only if material differences exist in the way the type of disaster is managed. Organizations should avoid planning for disasters that are highly unlikely to occur (e.g., a blizzard in the Caribbean).
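One way to apply the "material differences" test for scenarios is to describe how each one would be handled and keep a separate subplan only where the handling differs. The sketch below assumes a hypothetical scenario list and two hypothetical handling attributes (notification time and kind of loss); neither comes from the research itself.

```python
from collections import defaultdict

# Hypothetical scenarios mapped to (notification, kind of loss).
SCENARIOS = {
    "tornado": ("advance warning", "site loss"),
    "hurricane": ("advance warning", "site loss"),
    "data center fire": ("no warning", "site loss"),
    "IT outage": ("no warning", "service loss"),
}


def group_into_subplans(scenarios):
    """Group scenarios that are handled the same way into one subplan."""
    groups = defaultdict(list)
    for name, handling in scenarios.items():
        groups[handling].append(name)
    return {handling: sorted(names) for handling, names in groups.items()}


subplans = group_into_subplans(SCENARIOS)
```

With this sample data, four scenarios collapse into three subplans, because the tornado and hurricane entries are handled identically.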

Figure 1 shows 2011 Gartner Risk Management Disciplines Survey respondents' answers to the question, "What disaster scenarios does your organization plan for in its business continuity management efforts?"

Figure 1. Common Disasters Organizations Plan for in BCM Efforts

N = 159

Source: Gartner (June 2012)

Maintain Version and Configuration Control

Maintaining consistency between production and recovery environments remains one of the biggest disaster recovery testing and exercising challenges organizations face. While configuration and asset management tools can help, few organizations use them or other tools as part of ongoing disaster recovery plan updates.

Establish formal processes via the use of management tools and libraries, or manually, to ensure that all hardware and software references in a disaster recovery plan are up to date, and represent actual production and recovery configurations. Specific version and patch-level details should be included for all hardware, software and OSs, and these should be updated on a regular basis. For example, it is insufficient to state Windows 2000 in the recovery plan for a server running Windows 2000 Advanced Server Service Pack 4.
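A manual or scripted check of the plan against the production inventory might look like the following sketch. The record layout, the host name and the idea of a single "production inventory" list are assumptions made for illustration; an organization using a configuration management tool would draw these records from that tool instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemRecord:
    host: str
    product: str
    edition: str = ""
    patch_level: str = ""

    def is_fully_specified(self):
        """True only when product, edition and patch level are all stated."""
        return bool(self.product and self.edition and self.patch_level)


def audit_plan(plan_records, production_records):
    """Return (host, problem) tuples for plan entries that need attention."""
    production = {record.host: record for record in production_records}
    findings = []
    for record in plan_records:
        if not record.is_fully_specified():
            findings.append((record.host, "missing edition or patch level"))
        actual = production.get(record.host)
        if actual is None:
            findings.append((record.host, "not found in production inventory"))
        elif actual != record:
            findings.append((record.host, "plan differs from production"))
    return findings


# Hypothetical host; the versions mirror the Windows 2000 example in the text.
plan = [SystemRecord("srv01", "Windows 2000")]
production = [
    SystemRecord("srv01", "Windows 2000", "Advanced Server", "Service Pack 4")
]
findings = audit_plan(plan, production)
```

Here the plan entry is flagged twice: once for omitting the edition and patch level, and once for not matching what is actually running. The same comparison could be run between the plan and the recovery environment.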

Codify What Constitutes a Disaster

Defining what qualifies as a disaster and how it is declared are key considerations not covered by most recovery plans in adequate detail or focus. Yet, this is especially important, given the cost and potential level of disruption associated with declaring a disaster.

Organizations must ensure that processes and safeguards are established and documented within the disaster recovery plan to protect against mistaken declarations. Two to three senior executives should be authorized to declare a disaster, and this should occur only after specific criteria have been met to qualify the event as a disaster. Similar processes and criteria should be established to declare the end of a disaster, and to initiate failback procedures.
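The declaration safeguard can be expressed as a simple gate: the requester must hold one of the two to three authorized roles, and every documented criterion must be met. The following is a minimal sketch; the role titles and the criteria wording are hypothetical placeholders, and each organization would substitute the ones written into its own plan.

```python
from dataclasses import dataclass

# Hypothetical roles; a plan would name its own two to three executives.
AUTHORIZED_ROLES = frozenset({
    "Chief Executive Officer",
    "Chief Information Officer",
    "Chief Operating Officer",
})


@dataclass
class DeclarationRequest:
    requested_by_role: str
    criteria_met: dict  # criterion description -> True or False


def evaluate_declaration(request, authorized_roles=AUTHORIZED_ROLES):
    """Return (allowed, reason) for a request to declare a disaster."""
    if not 2 <= len(authorized_roles) <= 3:
        raise ValueError("expected two to three authorized roles")
    if request.requested_by_role not in authorized_roles:
        return False, "requester is not authorized to declare a disaster"
    if not request.criteria_met:
        return False, "no qualifying criteria were evaluated"
    unmet = sorted(name for name, met in request.criteria_met.items() if not met)
    if unmet:
        return False, "unmet criteria: " + ", ".join(unmet)
    return True, "declaration permitted"


request = DeclarationRequest(
    "Chief Information Officer",
    {"primary site unusable": True, "outage exceeds agreed tolerance": True},
)
decision = evaluate_declaration(request)
```

The same shape of check, with its own role list and criteria, could guard the declaration that a disaster has ended and that failback may begin.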

Include Testing in the Disaster Recovery Plan

Disaster recovery testing is challenging and expensive, but is a critical component of disaster recovery preparedness. Given the time and money spent on disaster recovery testing, it is surprising we don't see it called out more regularly or covered in enough detail within disaster recovery plans.

Testing should be a highlighted section of all disaster recovery plans, and should include specific details, such as when it is scheduled throughout the year, what types of tests are planned, which applications or business functions will be tested, and what testing processes and procedures should be followed. Besides physical recovery testing, organizations should establish a regular "paper test" schedule of when major reviews and walk-throughs of the recovery plan occur (see "Best Practices for Planning and Managing Disaster Recovery Testing").
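The details a testing section should carry (date, test type, scope and procedure) map naturally onto a small record, which also makes gaps easy to spot. The sketch below is illustrative only: the field names, dates, applications and section references are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PlannedTest:
    scheduled: date
    test_type: str      # for example "physical recovery" or "paper test"
    scope: tuple        # applications or business functions under test
    procedure_ref: str  # where the plan documents the test procedure


def missing_details(test):
    """List the fields the testing section leaves blank for this entry."""
    return [
        name
        for name in ("test_type", "scope", "procedure_ref")
        if not getattr(test, name)
    ]


def types_scheduled(schedule, year):
    """Return the set of test types scheduled in the given year."""
    return {test.test_type for test in schedule if test.scheduled.year == year}


# Hypothetical schedule entries.
SCHEDULE = [
    PlannedTest(date(2012, 3, 15), "paper test",
                ("full plan walk-through",), "Testing section 2"),
    PlannedTest(date(2012, 9, 20), "physical recovery", ("order entry",), ""),
]

gaps = missing_details(SCHEDULE[1])
covered = types_scheduled(SCHEDULE, 2012)
```

In this sample, the physical recovery test has no documented procedure, and both test types appear at least once in the year.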

Consider the Communication Infrastructure

The communication infrastructure is a top recovery priority for many organizations. However, since it is not necessarily seen as an application or a business service, it is not always called out or prioritized appropriately within disaster recovery plans.

The communication infrastructure should be considered a high-priority recovery function, and treated similarly to other mission-critical business services. This is especially important when business continuity functions such as an emergency response system might depend on the availability of the communication infrastructure for operation. Even for execution of the recovery plan, primary and alternative communication methods should be established and documented.

Recommended Reading

Some documents may not be available as part of your current Gartner subscription.

"Best Practices for Planning and Managing Disaster Recovery Testing"

"Ten Best Practices for Creating and Maintaining Effective Business Continuity Management Plans"

"Define, Develop and Verify Plans for Application Availability and Recoverability"

"Recent IT Outages Beg the Question: Who's Minding the Data?"

"New Evaluation Criteria and Provider Capabilities Are Changing Disaster Recovery Sourcing"

Evidence

This research is the result of over 40 disaster recovery document reviews and analyses, as well as direct discussions with Gartner clients regarding the creation and management of disaster recovery documents and plans.

© 2012 Gartner, Inc. and/or its Affiliates. All Rights Reserved. Reproduction and distribution of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner's research may discuss legal issues related to the information technology business, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The opinions expressed herein are subject to change without notice.
