Disaster Recovery Plan

Upload: osama

Post on 05-Jul-2015


Page 1: Disaster Recovery Plan

Disaster Recovery Plan

Page 2: Disaster Recovery Plan

What is Disaster Recovery?

Restoration of computing and telecommunications services after an event has disrupted those services.

Events (large or small):
– Earthquake
– Terrorist attacks on the World Trade Center, which killed thousands and affected everything from telephones to the New York Stock Exchange
– Malfunctioning software caused by a computer virus

Page 3: Disaster Recovery Plan

Why DRP?

A control might fail, or a threat might occur that management has not considered, or that management has decided to accept as an exposure because it cannot be covered by cost-effective controls.

When disaster strikes, it still must be possible to recover operations and mitigate losses.

Organizations are required to have a properly documented disaster recovery plan, at the very least to lessen the effects of such a disaster.

Page 4: Disaster Recovery Plan

Purpose

Enable the information systems function to restore operations.

The impact might be localized; for example, a PC user might accidentally delete critical data stored on a hard disk. The impact, however, might be widespread; for example, an organization’s mainframe computer installation might be destroyed by fire.

Page 5: Disaster Recovery Plan

Components of DRP

The disaster recovery plan has four components:
– Emergency Plan
– Backup Plan
– Recovery Plan
– Test Plan

Page 6: Disaster Recovery Plan

Emergency Plan

The emergency plan specifies the actions to be undertaken immediately when a disaster occurs. Management must identify those situations that require the plan to be invoked, for example:
– Major fire
– Major structural damage
– Terrorist attack

The actions to be initiated can vary somewhat depending on the nature of the disaster that occurs. For example, some disasters require that all personnel leave the information systems facilities immediately; others require that a few select personnel remain behind for a short period to sound alarms and shut down equipment.

Page 7: Disaster Recovery Plan

Aspects of Emergency Plan

The plan must show who is to be notified immediately when the disaster occurs, for example, management, the police, or the fire department.

The plan must show any actions to be undertaken, such as shutdown of equipment, removal of files, and termination of power.

Any evacuation procedures required must be specified.

Return procedures (e.g., conditions that must be met before the site is considered safe) must be designated.

Page 8: Disaster Recovery Plan

Backup Plan

The backup plan must specify:
– Type of backup
– Frequency
– Procedures
– Location of backup resources
– Restoration site
– Personnel
– Priorities
– Time frame

Backup plans can be complex or straightforward.

Page 9: Disaster Recovery Plan

Backup Resources

Resource: Nature of backup
– Personnel: Training and rotation of duties among information systems staff so they can take the place of others. Arrangements with another company for provision of staff.
– Hardware: Outsourcing arrangements for hardware provision.
– Facilities: Outsourcing arrangements for the provision of facilities.
– Documentation: Inventory of documentation stored securely on site and off site.
– Supplies: Inventory of critical supplies stored securely on site and off site, with a list of vendors who provide all supplies.
– Data/Information: Inventory of files stored securely on site and off site.
– Application software: Inventory of application software stored securely on site and off site.
– System software: Inventory of system software stored securely on site and off site.

Page 10: Disaster Recovery Plan

Backup Sites

Cold site: If an organization can tolerate some downtime, cold-site backup might be appropriate. A cold site has all the facilities needed to install a mainframe system: raised floors, air conditioning, power, communications lines, and so on. The mainframe is not present, however, and it must be provided by the organization wanting to use the cold site. An organization can establish its own cold-site facility or enter into an agreement with another organization to provide a cold-site facility.

Page 11: Disaster Recovery Plan

Hot site: If fast recovery is critical, an organization might need hot-site backup. All hardware and operations facilities will be available at the hot site. In some cases, software, data, and supplies might also be stored there. Hot sites are expensive to maintain. They usually are shared with other organizations that have hot site needs.

Warm site: A warm site provides an intermediate level of backup. It has all cold-site facilities plus hardware that might be difficult to obtain or install. For example, a warm site might contain selected peripheral equipment plus a small mainframe with sufficient power to handle critical applications in the short run.
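The trade-off among the three site types can be sketched as a lookup on how much downtime an organization can tolerate. This is only an illustration: the helper name `choose_backup_site` and the hour thresholds are assumptions, not values from the slides, and each organization would set its own.

```python
def choose_backup_site(tolerable_downtime_hours: float) -> str:
    """Suggest a backup site type from the downtime an organization can tolerate.

    The thresholds below are hypothetical and purely illustrative.
    """
    if tolerable_downtime_hours < 0:
        raise ValueError("tolerable downtime cannot be negative")
    if tolerable_downtime_hours <= 4:    # assumed: fast recovery is critical
        return "hot"
    if tolerable_downtime_hours <= 72:   # assumed: intermediate need
        return "warm"
    return "cold"                        # some downtime can be tolerated


for hours in (1, 24, 200):
    print(hours, "hours ->", choose_backup_site(hours), "site")
```

Cost runs the other way: the slides note that hot sites are expensive to maintain, so a real decision would weigh the cost of each option against the cost of downtime.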

Page 12: Disaster Recovery Plan

Reciprocal Agreement: Two or more organizations might agree to provide backup facilities to each other in the event of one suffering a disaster. This backup option is relatively cheap, but each participant must maintain sufficient capacity to operate another's critical systems. Reciprocal agreements are often informal in nature.

Any agreement for the use of a backup site should cover:
1. How soon the site will be made available subsequent to a disaster.
2. The number of organizations that will be allowed to use the site concurrently in the event of a disaster.
3. The priority to be given to concurrent users of the site in the event of a common disaster.
4. The period during which the site can be used.
5. The conditions under which the site can be used.
6. The facilities and services the site provider agrees to make available.
7. What controls will be in place and working at the off-site facility.

Page 13: Disaster Recovery Plan

Recovery Plan

Recovery plans set out procedures to restore full information systems capabilities. Recovery plans depend on the circumstances:
– Whether the disaster is global or localized
– Nature of the machine
– Applications
– Data to be recovered

A recovery committee works out the specifics of the recovery to be undertaken.

The plan should specify:
– Responsibilities of the committee
– Guidelines on priorities to be followed
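As a small illustration of priority guidelines, the sketch below orders applications for restoration by an assigned priority. The `Application` type, the application names, and the convention that 1 means "restore first" are all hypothetical; the slides do not prescribe a scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    name: str
    priority: int  # assumed convention: 1 = restore first


def recovery_order(apps):
    """Return application names in restoration order (ties broken by name)."""
    return [a.name for a in sorted(apps, key=lambda a: (a.priority, a.name))]


# Hypothetical inventory a recovery committee might maintain.
apps = [
    Application("payroll", 2),
    Application("billing", 1),
    Application("intranet wiki", 3),
]
print(recovery_order(apps))  # ['billing', 'payroll', 'intranet wiki']
```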

Page 14: Disaster Recovery Plan

Test Plan

The test plan identifies deficiencies in the emergency, backup, or recovery plans, or in the preparedness of an organization and its personnel in the event of a disaster.

Periodically, test plans must be invoked; that is, a disaster must be simulated and information systems personnel required to follow backup and recovery procedures.

To facilitate testing, a phased approach can be adopted. First, the disaster recovery plan can be tested by desk checking, inspection, and walkthroughs, much like the validation procedures adopted for programs. Next, a disaster can be simulated at a convenient time, for example, during a slow period in the day.

Page 15: Disaster Recovery Plan

Business Continuity Plan

Business continuity planning (BCP) is the act of proactively working out a way to prevent and manage the consequences of a disaster, limiting them to an extent that the business can afford. Business continuity planning determines how a company will keep functioning until its normal facilities are restored after a disruptive event. This encompasses how employees will be contacted, where they will go, and how they will keep doing their jobs.

Business continuity is the exercise of recovering from an availability interruption or disaster event in minutes instead of days. The chart below depicts the difference between disaster recovery and business continuity.

Page 16: Disaster Recovery Plan

[Chart: recovery timelines plotted against a time axis marked Minutes, Hours, Days]

Traditional Disaster Recovery Planning:
– Periodic offsite backup
– Restore data from backups
– Identify and enter lost data
– Resume processing

Business Continuity Planning:
– Continuous mirroring of data to remote site
– Perform target takeover and resume processing

Page 17: Disaster Recovery Plan

KPIs

Recovery Point Objective (RPO) – The pre-incident point in time to which data must be recovered in order to resume business transactions (the acceptable transaction data loss).

Recovery Time Objective (RTO) – The maximum elapsed time allowed to recover data and processing capability.

Together, these KPIs define the levels of service that organizations must consider when assessing business impact.
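A minimal sketch of how the two objectives might be checked against a backup schedule and a recovery estimate. It assumes periodic backups, so the worst-case data loss equals the full interval between backups; the function names and the figures in the example are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """With periodic backups, a failure just before the next backup loses
    up to one full interval of data, so the interval must not exceed the RPO."""
    return backup_interval <= rpo


def meets_rto(estimated_recovery_time: timedelta, rto: timedelta) -> bool:
    """Recovery of data and processing capability must finish within the RTO."""
    return estimated_recovery_time <= rto


# Hypothetical figures: nightly backups, 4-hour RPO, 6-hour restore, 8-hour RTO.
print(meets_rpo(timedelta(hours=24), timedelta(hours=4)))  # False
print(meets_rto(timedelta(hours=6), timedelta(hours=8)))   # True
```

In this example the nightly schedule fails the RPO even though the restore fits the RTO, which is the kind of gap that pushes an organization from periodic backups toward continuous mirroring.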

Business Continuity describes the processes and procedures an organization puts in place to ensure that essential functions can continue during and after a disaster.

Page 18: Disaster Recovery Plan

Business Impact Analysis

Business impact analysis is performed to determine the impacts associated with disruptions to specific functions or assets in a firm. These include:
– Operating impact
– Financial impact
– Legal or regulatory impact

For example, should the billing, receivables, and collections business functions be crippled by inaccessibility of information, cash flow to the business will suffer. Additional risks are that lost customers will never return, the business’ credit rating may suffer, and significant costs may be incurred for hiring temporary help. Lost revenues, additional recovery costs, fines and penalties, overtime, application and hardware costs, lost goodwill, and delayed collection of funds could all be part of the business impact of a disaster.

Page 19: Disaster Recovery Plan

Risk Analysis

Risk analysis identifies the functions and assets that are critical to a firm’s operations, and then establishes the probability of a disruption to those functions and assets. Once the risk is established, objectives and strategies to eliminate avoidable risks and minimize the impact of unavoidable risks can be set. A list of critical business functions and assets should first be compiled and prioritized. Following this, determine the probability of specific threats to those functions and assets. For example, a certain type of failure may occur once in 10 years. From the risk analysis, a set of objectives and strategies to prevent, mitigate, and recover from disruptive threats should be developed.
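One common way to turn such a probability into a number for prioritization is the annualized loss expectancy: the loss from a single occurrence multiplied by the expected occurrences per year. The slides do not name this measure, so treat the sketch below as one possible approach; the function name and the loss figure are hypothetical.

```python
def annualized_loss_expectancy(single_loss: float, occurrences: float, years: float) -> float:
    """Expected loss per year = loss per occurrence * (occurrences / years)."""
    if years <= 0:
        raise ValueError("years must be positive")
    return single_loss * occurrences / years


# The slide's "once in 10 years" failure, with an assumed loss of 500,000 per event.
print(annualized_loss_expectancy(500_000, occurrences=1, years=10))  # 50000.0
```

Under these assumptions, a control that prevents this failure is worth considering if it costs less than about 50,000 per year.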

Page 20: Disaster Recovery Plan

Disaster Recovery Plan

A disaster recovery plan (DRP) is an IT-focused plan designed to restore operability of the target systems, applications, or computer facility at an alternate site after an emergency. A DRP addresses major site disruptions that require site relocation. The DRP applies to major, usually catastrophic, events that deny access to the normal facility for an extended period. Typically, disaster recovery planning involves an analysis of business processes and continuity needs; it may also include a significant focus on disaster prevention.

Page 21: Disaster Recovery Plan

Disaster Tolerance

Disaster tolerance defines an environment’s ability to withstand major disruptions to systems and related business processes. Disaster tolerance at various levels should be built into an environment and can take the form of hardware redundancy, high availability/clustering solutions, multiple data centers, eliminating single points of failure, and distance solutions.

Page 22: Disaster Recovery Plan

Bare Metal Recovery

A bare metal recovery is the process of restoring a complete system, including system and boot partitions, system settings, applications, and data, to its original state at some point prior to a disaster.

High availability describes a system’s ability to continue processing and functioning for a very high percentage of the time, for example 99.999%. High availability can be implemented in IT infrastructure by removing single points of failure (SPOFs) through redundant components. Similarly, clustering and coupling applications between two or more systems can provide a highly available computing environment.
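To make the 99.999% figure concrete, the sketch below converts an availability percentage into allowed downtime per year, and shows how redundancy raises availability. Both helper names are hypothetical, the year is taken as 365 days, and the redundancy formula assumes component failures are independent, which real systems only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a system may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent: the system is down only if all copies are down.
    """
    return 1 - (1 - component_availability) ** copies


print(round(allowed_downtime_minutes(99.999), 2))   # 5.26 minutes per year
print(round(parallel_availability(0.99, 2), 6))     # 0.9999
```

Under the independence assumption, two components that are each 99% available give 99.99% together, which is why removing single points of failure is the usual first step toward high availability.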