bc_dr

Upload: paradescartar

Post on 03-Jun-2018


  • 8/11/2019 BC_DR

    1/54

  • 8/11/2019 BC_DR

    2/54

    Acknowledgments

    Material is sourced from:

    CISA Review Manual 2009, 2008, ISACA. All rights reserved. Used by permission.

    CISA Certified Information Systems Auditor All-in-One Exam Guide, Peter H. Gregory, McGraw-Hill

    Author: Susan J Lincke, PhD

    Univ. of Wisconsin-Parkside

    Reviewers/Contributors: Todd Burri & Megan Reid

    Funded by National Science Foundation (NSF) Course, Curriculum and Laboratory Improvement (CCLI) grant 0837574: Information Security: Audit, Case Study, and Service Learning.

    Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and/or source(s) and do not necessarily reflect the views of the National Science Foundation.

  • 8/11/2019 BC_DR

    3/54

    Imagine a company

    Bank with 1 million accounts, Social Security numbers, credit cards, loans

    Airline serving 50,000 people on 250 flights daily

    Pharmacy system filling 5 million prescriptions per year, some of them life-saving

    Factory with 200 employees producing 200,000 products per day using robots

  • 8/11/2019 BC_DR

    4/54

    Imagine a system failure

    Server failure

    Disk System failure

    Hacker break-in

    Denial of Service attack

    Extended power failure

    Snow storm

    Spyware

    Malevolent virus or worm

    Earthquake, tornado

    Employee error or revenge

    How will this affect each business?

  • 8/11/2019 BC_DR

    5/54

    First Step: Business Impact Analysis

    Which business processes are of strategic importance?

    What disasters could occur?

    What impact would they have on the organization financially? Legally? On human life? On reputation?

    What is the required recovery time period?

    Answers are obtained via questionnaires, interviews, or meetings with key users of IT

  • 8/11/2019 BC_DR

    6/54

    Event Damage Classification

    Negligible: No significant cost or damage

    Minor: A non-negligible event with no material or financial impact on the business

    Major: Impacts one or more departments and may impact outside clients

    Crisis: Has a major material or financial impact on the business

    Minor, Major, and Crisis events should be documented and tracked until repaired

  • 8/11/2019 BC_DR

    7/54

    Workbook: Disasters and Impact (assumes a university)

    Problematic Event or Disaster | Affected Business Process(es) | Impact Classification & Effect on finances, legal liability, human life, reputation
    Fire | Classrooms, business departments | Crisis, at times Major; human life
    Hacking attack | Registration, advising | Major; legal liability
    Network unavailable | Registration, advising, classes, homework, education | Crisis
    Social engineering/fraud | Registration | Major; legal liability
    Server failure (disk/server) | Registration, advising, classes, homework, education | Major, at times Crisis

  • 8/11/2019 BC_DR

    8/54

    Recovery Time: Terms

    Interruption Window: Time duration the organization can wait between point of failure and service resumption

    Service Delivery Objective (SDO): Level of service in Alternate Mode

    Maximum Tolerable Outage: Max time in Alternate Mode

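    [Figure: timeline along a Time axis. Regular Service runs until the Interruption; the Interruption Window lasts until the Disaster Recovery Plan is implemented; Alternate Mode then runs at the SDO level for at most the Maximum Tolerable Outage; Regular Service resumes once the Restoration Plan is implemented.]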
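The terms on this slide can be checked against a real or simulated outage. The sketch below is only an illustration: the `Outage` record, its field names, and `check_outage` are hypothetical, and it follows this slide's definitions (Interruption Window measured from the point of failure to resumption in Alternate Mode; Maximum Tolerable Outage as the maximum time spent in Alternate Mode).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Outage:
    """Hypothetical record of one outage (field names are illustrative)."""
    failed_at: datetime          # point of failure
    alternate_mode_at: datetime  # service resumed in Alternate Mode (DRP implemented)
    restored_at: datetime        # regular service resumed (Restoration Plan implemented)


def check_outage(outage, interruption_window, max_tolerable_outage):
    """Compare an outage against the two time limits defined on the slide."""
    interruption = outage.alternate_mode_at - outage.failed_at
    time_in_alternate_mode = outage.restored_at - outage.alternate_mode_at
    return {
        "interruption": interruption,
        "within_interruption_window": interruption <= interruption_window,
        "time_in_alternate_mode": time_in_alternate_mode,
        "within_max_tolerable_outage": time_in_alternate_mode <= max_tolerable_outage,
    }


# Example with made-up times: 3 hours down, then 2 days in Alternate Mode.
example = Outage(
    failed_at=datetime(2024, 1, 1, 8, 0),
    alternate_mode_at=datetime(2024, 1, 1, 11, 0),
    restored_at=datetime(2024, 1, 3, 11, 0),
)
result = check_outage(example, timedelta(hours=4), timedelta(days=3))
```

With a 4-hour Interruption Window and a 3-day Maximum Tolerable Outage, this made-up outage stays inside both limits.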

  • 8/11/2019 BC_DR

    9/54

    Definitions

    Business Continuity: Offer critical services in event of disruption

    Disaster Recovery: Survive interruption to computer information systems

    Alternate Process Mode: Service offered by backup system

    Disaster Recovery Plan (DRP): How to transition to Alternate Process Mode

    Restoration Plan: How to return to regular system mode

  • 8/11/2019 BC_DR

    10/54

    RPO and RTO

    Recovery Point Objective (RPO): How far back can you fail to? One week's worth of data?

    Recovery Time Objective (RTO): How long can you operate without a system? Which services can last how long?

    [Figure: timeline centered on the Interruption, with RPO on one side and RTO on the other. Tick labels: One Week, One Day, One Hour; 1, 2, and 24 Hours.]

  • 8/11/2019 BC_DR

    11/54

    Recovery Point Objective

    Mirroring:

    RAID

    Backup

    Images

    Orphan Data: Data which is lost and never recovered.

    RPO influences the Backup Period
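A minimal sketch of why the RPO drives the backup period, under the simplifying assumption that a failure just before the next scheduled backup loses up to one full backup period of data (backup duration and transfer time are ignored). The function names are hypothetical.

```python
def worst_case_loss_hours(backup_period_hours):
    """With periodic backups, the worst-case orphan data equals one
    full backup period (simplifying assumption: instant backups)."""
    return backup_period_hours


def meets_rpo(backup_period_hours, rpo_hours):
    """True if the worst-case orphan data stays within the RPO."""
    return worst_case_loss_hours(backup_period_hours) <= rpo_hours


def suggest_control(rpo_hours):
    """Illustrative rule of thumb only: an RPO of zero cannot be met by
    periodic backups, so it points to mirroring."""
    if rpo_hours <= 0:
        return "mirroring (e.g. RAID)"
    return f"back up at least every {rpo_hours:g} hours"
```

For example, daily backups satisfy an RPO of one day but not an RPO of two hours, and an RPO of zero hours calls for mirroring rather than backups.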

  • 8/11/2019 BC_DR

    12/54

    Business Impact Analysis Summary

    Workbook: Partial BIA for a university

    Service | Recovery Time Objective (Hours) | Recovery Point Objective (Hours) | Critical Resources (Computer, people, peripherals) | Special Notes (Unusual treatment at specific times, unusual risk conditions)
    Registration | 4 hours | 0 hours | SOLAR, network, Registrar | High priority during Nov-Jan, March-June, August.
    Personnel | 8 hours | 2 hours | PeopleSoft | Can operate manually for some time
    Teaching | 1 hour | 1 day | D2L, network, faculty files | During school semester: high priority.
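The BIA summary lends itself to a small data structure. The sketch below encodes the three rows from the partial university BIA (1 day is written as 24 hours) and derives a recovery order from the RTO; the `BiaEntry` class and its field names are hypothetical, and sorting by RTO is just one plausible way to prioritize.

```python
from dataclasses import dataclass


@dataclass
class BiaEntry:
    """One row of a BIA summary (illustrative field names)."""
    service: str
    rto_hours: float
    rpo_hours: float
    critical_resources: list


bia = [
    BiaEntry("Registration", 4, 0, ["SOLAR", "network", "Registrar"]),
    BiaEntry("Personnel", 8, 2, ["PeopleSoft"]),
    BiaEntry("Teaching", 1, 24, ["D2L", "network", "faculty files"]),
]

# Services with the shortest RTO must be back in service first.
recovery_order = [e.service for e in sorted(bia, key=lambda e: e.rto_hours)]

# The service that can least afford to lose data (smallest RPO).
tightest_rpo = min(bia, key=lambda e: e.rpo_hours).service
```

Here Teaching has the shortest RTO, while Registration has the tightest RPO, which shows that the two objectives rank services differently.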

  • 8/11/2019 BC_DR

    13/54

    Disruption vs. Recovery Costs

    [Figure: Cost plotted against Time. Curves: Service Downtime and Alternative Recovery Strategies, with the Minimum Cost point marked. Hot Site, Warm Site, and Cold Site are marked on the plot.]

  • 8/11/2019 BC_DR

    14/54

    Alternative Recovery Strategies

    Hot Site: Fully configured, ready to operate within hours

    Warm Site: Ready to operate within days: no main computer, or a low-powered one. Does contain disks, network, peripherals.

    Cold Site: Ready to operate within weeks. Contains electrical wiring, air conditioning, flooring

    Duplicate or Redundant Info. Processing Facility: Standby hot site within the organization

    Reciprocal Agreement: With another organization or division

    Mobile Site: Fully- or partially-configured trailer comes to your site, with microwave or satellite communications
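As a rough illustration of how the RTO narrows the choice of alternate site, the sketch below picks the cheapest site type that can be ready in time. The slide only says "hours", "days", and "weeks"; the specific readiness figures, the cost ordering (cold cheapest, hot most expensive), and the function name are assumptions made for this example.

```python
# Cheapest first. Readiness times in hours are illustrative assumptions,
# not figures from the slide.
SITES = [
    ("cold site", 14 * 24),  # "within weeks"
    ("warm site", 3 * 24),   # "within days"
    ("hot site", 4),         # "within hours"
]


def cheapest_adequate_site(rto_hours, sites=SITES):
    """Return the first (cheapest) site that is ready within the RTO."""
    for name, ready_hours in sites:
        if ready_hours <= rto_hours:
            return name
    # Nothing listed is fast enough; a redundant facility may be needed.
    return None
```

Under these assumed figures, an RTO of 8 hours requires a hot site, an RTO of about four days allows a warm site, and an RTO of a month allows a cold site.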

  • 8/11/2019 BC_DR

    15/54

    Hot Site

    Contractual costs include: basic subscription, monthly fee, testing charges, activation costs, and hourly/daily use charges

    Contractual issues include: other subscriber access, speed of access, configurations, staff assistance, audit & test

    A hot site is for emergency use, not long-term use

    May offer warm or cold site for extended durations

  • 8/11/2019 BC_DR

    16/54

    Reciprocal Agreements

    Advantage: Low cost

    Problems may include:

    Quick access

    Compatibility (computer, software, ...)

    Resource availability: computer, network, staff

    Priority of visitor

    Security (less a problem if same organization)

    Testing required

    Susceptibility to same disasters

    Length of welcomed stay

  • 8/11/2019 BC_DR

    17/54

    Network Disaster Recovery

    Redundancy: Includes routing protocols, fail-over, multiple paths

    Alternative Routing: More than one medium or more than one network provider

    Diverse Routing: Multiple paths, one medium type

    Last-mile circuit protection: E.g., local: microwave & cable

    Long-haul network diversity: Redundant network providers

    Voice Recovery: Voice communication backup

  • 8/11/2019 BC_DR

    18/54

    RAID: Data Mirroring

    Redundant Array of Independent Disks

    RAID 0: Striping (disks hold AB | CD)

    RAID 1: Mirroring (disks hold ABCD | ABCD)

    Higher Level RAID: Striping & Redundancy (disks hold AB | CD | Parity)
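The parity disk in higher-level RAID lets the array rebuild any single lost disk. The slide does not say how parity is computed; the sketch below assumes byte-wise XOR, the usual choice (for example in RAID 5), and uses the slide's AB / CD stripe as data. Function names are hypothetical and blocks are assumed to be of equal length.

```python
def xor_parity(blocks):
    """XOR equal-length byte blocks together, position by position."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_lost_block(surviving_blocks, parity):
    """XOR of the surviving blocks and the parity yields the lost block."""
    return xor_parity(list(surviving_blocks) + [parity])


stripe = [b"AB", b"CD"]        # data striped across two disks
parity = xor_parity(stripe)    # stored on a third disk

# Simulate losing the disk that held "CD".
recovered = rebuild_lost_block([b"AB"], parity)
```

Because XOR is its own inverse, the same routine rebuilds whichever single block is lost, including the parity block itself.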

  • 8/11/2019 BC_DR

    19/54

    RPO Controls (Workbook)

    Data File and System/Directory Location | RPO (Hours) | Special Treatment (Backup period, RAID, File Retention Strategies)
    Registration | 0 hours | RAID. Mobile Site?
    Teaching | 1 day | Daily backups. Facilities Computer Center as redundant info processing center

  • 8/11/2019 BC_DR

    20/54

    Business Continuity Process

    Perform Business Impact Analysis

    Prioritize services to support critical business processes

    Determine alternate processing modes for critical and vital services

    Develop the Disaster Recovery plan for IS systems recovery

    Develop BCP for business operations recovery and continuation

    Test the plans

    Maintain plans

  • 8/11/2019 BC_DR

    21/54

    Classification of Services

    Critical ($$$$): Cannot be performed manually. Tolerance to interruption is very low

    Vital ($$): Can be performed manually for a very short time

    Sensitive ($): Can be performed manually for a period of time, but may cost more in staff

    Nonsensitive: Can be performed manually for an extended period of time with little additional cost and minimal recovery effort

  • 8/11/2019 BC_DR

    22/54

    Determine Criticality of Business Processes

    [Figure: hierarchy chart with priority rankings in parentheses. Corporate divides into Sales (1), Shipping (2), and Engineering (3). Lower levels include Web Service (1) and Sales Calls (2); Product A (1), Product B (2), and Product C (3); and Orders (1) and Inventory (2).]

  • 8/11/2019 BC_DR

    23/54

    Question

    The amount of data transactions that are allowed to be lost following a computer failure (i.e., duration of orphan data) is the:

    1. Recovery Time Objective

    2. Recovery Point Objective

    3. Service Delivery Objective

    4. Maximum Tolerable Outage

  • 8/11/2019 BC_DR

    24/54

    Question

    When the RTO is large, this is associated

    with:

    1. Critical applications

    2. A speedy alternative recovery strategy

    3. Sensitive or nonsensitive services

    4. An extensive restoration plan

  • 8/11/2019 BC_DR

    25/54

    Question

    When the RPO is very short, the best

    solution is:

    1. Cold site

    2. Data mirroring

    3. A detailed and efficient Disaster

    Recovery Plan

    4. An accurate Business Continuity Plan

  • 8/11/2019 BC_DR

    26/54

    Disaster Recovery

    Disaster Recovery Testing

  • 8/11/2019 BC_DR

    27/54

    An Incident Occurs

    Call Security Officer (SO) or committee member

    SO follows pre-established protocol

    Security officer declares disaster

    Emergency Response Team: Human life is the first concern

    Phone tree notifies relevant participants

    IT follows Disaster Recovery Plan

    Public relations interfaces with media (everyone else stays quiet)

    Mgmt, legal counsel act

  • 8/11/2019 BC_DR

    28/54

    Concerns for a BCP/DR Plan

    Evacuation plan: People's lives always take first priority

    Disaster declaration: Who, how, for what?

    Responsibility: Who covers necessary disaster recovery functions

    Procedures for Disaster Recovery

    Procedures for Alternate Mode operation

    Resource Allocation: During recovery & continued operation

    Copies of the plan should be off-site

  • 8/11/2019 BC_DR

    29/54

    Disaster Recovery Responsibilities

    General Business

    First responder: Evacuation, fire, health

    Damage Assessment

    Emergency Mgmt

    Legal Affairs

    Transportation/Relocation/Coordination (people, equipment)

    Supplies

    Salvage

    Training

    IT-Specific Functions

    Software

    Application

    Emergency operations

    Network recovery

    Hardware

    Database/Data Entry

    Information Security

  • 8/11/2019 BC_DR

    30/54

    BCP Documents

    The slide groups these plans by focus (IT or Business) and by purpose (Event, Recovery, Business Continuity):

    Disaster Recovery Plan: Procedures to recover at alternate site

    Business Recovery Plan: Recover business after a disaster

    IT Contingency Plan: Recovers major application or system

    Occupant Emergency Plan: Protect life and assets during physical threat

    Cyber Incident Response Plan: Malicious cyber incident

    Crisis Communication Plan: Provide status reports to public and personnel

    Business Continuity Plan

    Continuity of Operations Plan: Longer duration outages

  • 8/11/2019 BC_DR

    31/54

    Workbook: Disaster Recovery Plan

    Classification (Critical or Vital) | Business Function | Disaster or Problem Event(s) | Procedure for Handling (Section 5)
    Vital | Registration | Computer failure | If total failure, forward requests to UW-System. Otherwise, use 1-week-old database for read purposes only.
    Critical | Teaching | Computer failure | Faculty DB Recovery Procedure
    Sensitive | Personnel | Hacking attack, fraud, social engineering | Call Manager of IT. Notify management. If hacking attack, bring DB off-line. Complete Incident event form. Adhere to Breach Notification Law.

  • 8/11/2019 BC_DR

    32/54

    MTBF = MTTF + MTTR

    Mean Time to Failure (MTTF)

    Mean Time to Repair (MTTR)

    Mean Time Between Failure (MTBF)

    Measure of availability: 5 9s = 99.999% of time working = about 5 minutes of failure per year.

    [Figure: timeline alternating "works" and "repair" periods, labeled 1 day and 84 days.]
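The availability figures on this slide can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, a 365-day year is assumed, and the figure is read as 84 days working followed by 1 day of repair, which is an assumption because the labels are ambiguous.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def availability(mttf, mttr):
    """Fraction of time working: MTTF / MTBF, where MTBF = MTTF + MTTR."""
    return mttf / (mttf + mttr)


def downtime_minutes_per_year(avail):
    """Expected minutes of failure per year at a given availability."""
    return (1 - avail) * MINUTES_PER_YEAR


five_nines_downtime = downtime_minutes_per_year(0.99999)  # about 5.26 minutes
figure_availability = availability(mttf=84, mttr=1)       # 84/85, about 98.8%
```

Five nines works out to roughly 5.26 minutes of failure per year, which the slide rounds to 5 minutes.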

  • 8/11/2019 BC_DR

    33/54

    Disaster Recovery Test Execution

    Always tested in this order:

    Desk-Based Evaluation/Paper Test: A group steps through a paper procedure and mentally performs each step.

    Preparedness Test: Part of the full test is performed. Different parts are tested regularly.

    Full Operational Test: Simulation of a full disaster

  • 8/11/2019 BC_DR

    34/54

    Business Continuity Test Types

    Checklist Review: Reviews coverage of plan: are all important concerns covered?

    Structured Walkthrough: Reviews all aspects of plan, often walking through different scenarios

    Simulation Test: Execute plan based upon a specific scenario, without alternate site

    Parallel Test: Bring up alternate off-site facility, without bringing down regular site

    Full-Interruption: Move processing from regular site to alternate site.

  • 8/11/2019 BC_DR

    35/54

    Testing Objectives

    Main objective: Verify that existing plans will result in successful recovery of infrastructure & business processes

    Also can:

    Identify gaps or errors

    Verify assumptions

    Test time lines

    Train and coordinate staff

  • 8/11/2019 BC_DR

    36/54

    Testing Procedures

    Tests start simple and become more challenging with progress

    Include an independent 3rd party (e.g. auditor) to observe test

    Retain documentation for audit reviews

    Steps:

    Develop test objectives

    Execute test

    Evaluate test

    Develop recommendations to improve test effectiveness

    Follow up to ensure recommendations are implemented

  • 8/11/2019 BC_DR

    37/54

    Test Stages

    PreTest: Set the stage

    Set up equipment

    Prepare staff

    Test: Actual test

    PostTest: Cleanup

    Return resources

    Calculate metrics: Time required, % success rate in processing, ratio of successful transactions in Alternate mode vs. normal mode

    Delete test data

    Evaluate plan

    Implement improvements

    [Figure: three stages in sequence: PreTest, Test, PostTest]
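The PostTest metrics named on this slide can be computed from simple transaction counts. The sketch below is one possible reading: the function and parameter names are hypothetical, and the "ratio of successful transactions in Alternate mode vs. normal mode" is interpreted as a ratio of successful-transaction counts over comparable periods, which is an assumption.

```python
def post_test_metrics(alt_ok, alt_total, normal_ok, elapsed_minutes):
    """Summarize a recovery test.

    alt_ok / alt_total: successful and attempted transactions in Alternate mode.
    normal_ok: successful transactions in normal mode over a comparable period.
    elapsed_minutes: time the test took to reach Alternate mode operation.
    """
    return {
        "time_required_minutes": elapsed_minutes,
        "success_rate_pct": 100 * alt_ok / alt_total,
        "alt_vs_normal_ratio": alt_ok / normal_ok,
    }


# Made-up numbers for illustration only.
metrics = post_test_metrics(alt_ok=900, alt_total=1000, normal_ok=1000,
                            elapsed_minutes=180)
```

With these made-up counts, the test shows a 90% success rate in Alternate mode and a 0.9 ratio against normal processing.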

  • 8/11/2019 BC_DR

    38/54

    Gap Analysis

    Comparing Current Level with Desired Level

    Which processes need to be improved?

    Where is staff or equipment lacking?

    Where does additional coordination need

    to occur?

  • 8/11/2019 BC_DR

    39/54

    Insurance

    IPF = Information Processing Facility

    IPF & Equipment

    Business Interruption: Loss of profit due to IS interruption

    Extra Expense: Extra cost of operation following IPF damage

    IS Equipment & Facilities: Loss of IPF & equipment due to damage

    Data & Media

    Valuable Papers & Records: Covers cash value of lost/damaged paper & records

    Media Reconstruction: Cost of reproduction of media

    Media Transportation: Loss of data during transport

    Employee Damage

    Fidelity Coverage: Loss from dishonest employees

    Errors & Omissions: Liability for error resulting in loss to client

  • 8/11/2019 BC_DR

    40/54

    Auditing BCP

    Includes:

    Is the BIA complete, with RPO/RTO defined for all services?

    Is the BCP in line with business goals, effective, and current?

    Is it clear who does what in the BCP and DRP?

    Is everyone trained, competent, and happy with their jobs?

    Is the DRP detailed, maintained, and tested?

    Are the BCP and DRP consistent in their recovery coverage?

    Are people listed in the BCP/phone tree current and do they have a copy of the BC manual?

    Are the backup/recovery procedures being followed?

    Does the hot site have correct copies of all software?

    Is the backup site maintained to expectations, and are the expectations effective?

    Was the DRP test documented well, and was the DRP updated?

  • 8/11/2019 BC_DR

    41/54

    Summary of BC Security Controls

    RAID

    Backups: Incremental backup, differential backup

    Networks: Diverse routing, alternative routing

    Alternative Site: Hot site, warm site, cold site, reciprocal agreement, mobile site

    Testing: checklist, structured walkthrough, simulation, parallel, full interruption

    Insurance

  • 8/11/2019 BC_DR

    42/54

    Question

    The FIRST thing that should be done when you

    discover an intruder has hacked into your computer

    system is to:

    1. Disconnect the computer facilities from the computer

    network to hopefully disconnect the attacker

    2. Power down the server to prevent further loss of

    confidentiality and data integrity.

    3. Call the manager.

    4. Follow the directions of the Incident Response Plan.

  • 8/11/2019 BC_DR

    43/54

    Question

    During an audit of the business continuity plan, the finding of MOST concern is:

    1. The phone tree has not been double-checked in 6 months

    2. The Business Impact Analysis has not been updated this year

    3. A test of the backup-recovery system is not performed regularly

    4. The backup library site lacks a UPS

  • 8/11/2019 BC_DR

    44/54

    Question

    The first and most important BCP test is the:

    1. Fully operational test

    2. Preparedness test

    3. Security test

    4. Desk-based paper test

  • 8/11/2019 BC_DR

    45/54

    Question

    When a disaster occurs, the highest

    priority is:

    1. Ensuring everyone is safe

    2. Minimizing data loss by saving important

    data

    3. Recovery of backup tapes

    4. Calling a manager

  • 8/11/2019 BC_DR

    46/54

    Question

    A documented process where one determines the most crucial IT operations from the business perspective:

    1. Business Continuity Plan

    2. Disaster Recovery Plan

    3. Restoration Plan

    4. Business Impact Analysis

  • 8/11/2019 BC_DR

    47/54

    Question

    The PRIMARY goal of the Post-Test is:

    1. Write a report for audit purposes

    2. Return to normal processing

    3. Evaluate test effectiveness and update

    the response plan

    4. Report on test to management

  • 8/11/2019 BC_DR

    48/54

    Question

    A test that verifies that the alternate site can successfully process transactions is known as:

    1. Structured walkthrough

    2. Parallel test

    3. Simulation test

    4. Preparedness test

  • 8/11/2019 BC_DR

    49/54

    Vocabulary

    Business Continuity Plan (BCP), Business Impact Analysis (BIA), RAID, Disaster Recovery Plan (DRP)

    Hot site, warm site, cold site, reciprocal agreement, mobile site

    Interruption window, maximum tolerable outage, service delivery objective

    Recovery point objective (RPO)

    Recovery time objective (RTO)

    Desk-based or paper test, preparedness test, fully operational test

    Test: checklist, structured walkthrough, simulation test, parallel test, full interruption, pretest, post-test

    Diverse routing, alternative routing

    Incremental backup, differential backup

  • 8/11/2019 BC_DR

    50/54


    Definitions adapted from:

    All-In-One CISA Exam Guide

  • 8/11/2019 BC_DR

    51/54

    HEALTH FIRST CASE STUDY

    Business Impact Analysis & Business Continuity

    Jamie Ramon, MD: Doctor

    Chris Ramon, RD: Dietician

    Terry: Medical Admin

    Pat: Software Consultant

  • 8/11/2019 BC_DR

    52/54

    Step 1: Define Threats Resulting in Business Disruption

    Key questions:

    Which business processes are of strategic importance?

    What disasters could occur?

    What impact would they have on the organization financially? Legally? On human life? On reputation?

    Problematic Event or Disaster | Affected Business Process(es) | Impact Classification & Effect on finances, legal liability, human life, reputation

  • 8/11/2019 BC_DR

    53/54

    Step 2: Define Recovery Objectives

    [Figure: timeline centered on the Interruption, with Recovery Point Objective on one side and Recovery Time Objective on the other. Tick labels: One Week, One Day, One Hour; 1, 2, and 24 Hours.]

    Service | Recovery Time Objective (Hours) | Recovery Point Objective (Hours) | Critical Resources (Computer, people, peripherals) | Special Notes (Unusual treatment at specific times, unusual risk conditions)

  • 8/11/2019 BC_DR

    54/54

    Business Continuity

    Step 3: Attaining Recovery Point Objective (RPO)

    Step 4: Attaining Recovery Time Objective (RTO)

    Classification (Critical or Vital) | Business Function | Disaster or Problem Event(s) | Procedure for Handling (Section 5)