problem management process ver1.0

Upload: deepak-rustagi

Post on 09-Feb-2018


  • 7/22/2019 Problem Management Process Ver1.0


    Loblaw

    IT Service Management Processes

    Problem Management Process


    Document Name: Problem Management Process

    Version History

    Version   Name          Comment (the reason for the increment to the version)   Date
    1.00      Ali Alaswad   1st draft                                               July 18, 2008

    Final

    Document Distribution Control

    Recipient Name    Version   Date
    Bill Charters     1.0       July , 2008
    Patrick Ma        1.0       July , 2008
    Dorota Mac        1.0       July , 2008
    Bobby Seebalack   1.0       July , 2008


    Table of Contents

    1. Process Goal
    2. Process Scope
    3. Process Benefits
    4. Process Overview
    4.1. Problem Management includes the following standard phases
    4.2. High Level Process Flow (Reactive)
    4.3. High Level Process Flow (Proactive)
    5. Process Triggers
    6. Process Interfaces with Other ITSM Processes
    7. Problem Policy
    8. Roles and Responsibilities
    9. Roles Assignment Matrix
    10. Problem Priorities
    11. Impact-Urgency Matrix
    12. Problem Service Level Targets Definition
    13. Major Problem Review
    14. Known Error Database
    15. Process Deliverables
    16. Process Measurement (Metrics) and Reporting
    16.1. Metrics
    17. Process Meetings
    17.1. Problem Management Meeting
    17.2. Monthly Meeting
    18. Process RACI Chart
    19. Process Detailed Description
    20. Legend & Definitions
    21. Attachments


    1. Process Goal

    To prevent problems and resulting incidents from happening, to eliminate recurring

    incidents and to minimize the impact of incidents that cannot be prevented.

    2. Process Scope

    Diagnose the root cause of incidents.

    Determine the resolution to those problems.

    Ensure that the resolution is implemented through the appropriate control procedures, especially Change Management and Release Management.

    Maintain information about problems and the appropriate workarounds and resolutions.

    3. Process Benefits

    Improved IT service quality

    Incident volume reduction

    Improved knowledge base

    Permanent solutions

    Better service desk first-time fix rate (workaround)

    Works together with Incident Management and Change Management to ensure that IT service availability and quality are increased.

    Recording information about problems speeds up resolution and helps identify permanent solutions, reducing the number and resolution time of incidents.

    Higher productivity of business and IT staff

    Reduction in the cost of effort spent on fire-fighting or resolving repeated incidents.


    4. Process Overview

    The Problem Management process is one of the IT Service Management processes. It works closely with the Incident Management and Change Management processes.

    There are two types of process activities: Reactive and Proactive.

    The Reactive activity is concerned with detected errors (mainly from the Incident Management process) that require Root Cause Analysis. The outcome of this activity is a Known Error and workaround, which is recorded in the Known Error Database.

    The Proactive activity is concerned with reviewing the Known Errors and Incident/Problem Reports (patterns of failures/events). It analyzes this information and uses data collected by other IT Service Management processes to identify trends or significant problems. The outcome of this activity is a solution that prevents the error from happening again, or a workaround. It is driven as part of Continual Service Improvement.

    4.1. Problem Management includes the following standard phases:

    Problem Detection

    Problem Recording (Prioritize & Categorize)

    Investigation and Diagnosis

    Create Known Error Record (Workaround)

    Resolution (Permanent Solution)

    Problem Closure

    Figure - 1
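    The phases in Figure 1 form a linear lifecycle. The sketch below is a minimal illustration in Python, assuming a problem record moves through the phases strictly in the order listed; the `Phase` enum and the `next_phase` helper are illustrative names and are not part of this process document.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Standard Problem Management phases, in the order listed in section 4.1."""
    DETECTION = 1
    RECORDING = 2        # includes prioritization and categorization
    INVESTIGATION = 3    # investigation and diagnosis
    KNOWN_ERROR = 4      # known error record with workaround
    RESOLUTION = 5       # permanent solution
    CLOSURE = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after closure."""
    if current is Phase.CLOSURE:
        return None
    return Phase(current.value + 1)
```

    For example, `next_phase(Phase.INVESTIGATION)` yields `Phase.KNOWN_ERROR`. A real tool would probably also allow a record to be rejected or re-opened, which this sketch leaves out.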


    4.2. High Level Process Flow (Reactive)

    Figure - 2. Flow diagram; steps shown: Problem Detected, Problem Recorded, Assign Problem Record to Problem Manager, Review and Validate, Assign to Appropriate Workgroup, Investigate and Isolate Root Cause, Provide Solution, Verify Resolution, Update Known Error Database, Review and Close Problem Record, Closure of Problem Record.

    4.3. High Level Process Flow (Proactive)

    Figure - 3. Flow diagram; steps shown: Review Data/Reports and Proactive Monitoring, Create Problem Record, Follow the Same Steps in the Reactive Workflow.


    5. Process Triggers

    Reaction to one or more incidents

    Problem triggered in testing

    Trend analysis in errors and faults

    Suppliers may trigger the need for some Problem Records through the notification of potential faults or known deficiencies in their products or services

    Availability Management may initiate a problem to investigate, diagnose and analyze how to reduce downtime and increase uptime.


    6. Process Interfaces with Other ITSM Processes

    Problem Management interacts with other processes as shown in the diagram below.

    Figure - 4. Interfaces between the Problem Management Process and other processes, with the information exchanged:

    Change Management: request to participate in the post implementation review; RFCs
    Configuration Management: Configuration Item information in CMDB; entry to problem and known error records
    Incident Management: Known Error Records, Workarounds, Problem Resolution; logged incident against Configuration Item(s)
    Service Level Management: reports of problems and known errors by service; SLA
    Capacity Management: resolutions for capacity related problems and known errors; reports of capacity related problems and known errors
    Availability Management: availability reports used to indicate current or future problems; reports of availability related problems and known errors
    Release Management: notice of release; reports of any problem(s) introduced by release


    7. Problem Policy

    Policy -1: Incident and Problem are two separate processes, but they mostly use the same tools and similar categorization, impact and priority coding systems.

    Policy -2: A problem record can be created by anyone in IT, anyone benefiting from IT services, or anyone providing services to IT.

    Policy -3: A problem is different from an incident. A problem is created to isolate the cause when an incident occurs with an unknown cause, to eliminate the known cause of an incident (permanent solution), or to prevent an incident from occurring.

    Policy -4: Each Problem Record documents the lifecycle of a single problem.

    Policy -5: One centralized tool is used for problems across the IT organization.

    Policy -6: Problem Management should maintain information about problems and the appropriate workarounds and resolutions. All known errors and workarounds must be registered in the Known Error Database.

    Policy -7: The problem management meeting should be conducted regularly (weekly). The Problem Manager is accountable and responsible for facilitating and managing those meetings.

    Policy -8: There are two types of Problem Management process activities: Reactive and Proactive.

    Policy -9: All problems go through the process, and all problem initiators must complete the required information in the problem record.

    Policy -10: The Problem Manager is accountable for the complete problem lifecycle and provides a single point of coordination.

    Policy -11: "End user" means all parties or individuals benefiting from IT services.


    8. Roles and Responsibilities

    Role / Responsibilities

    Process Owner

        Owns the Problem Management process
        Defining the process strategy
        Ensuring that appropriate process documentation is available and current
        Defining appropriate policies and standards to be employed throughout the process
        Periodically auditing the process to ensure compliance with policy and standards
        Periodically reviewing the process strategy to ensure that it is still appropriate, and changing it as required
        Communicating process information or changes as appropriate to ensure awareness
        Providing process resources to support activities required throughout the Service Management lifecycle
        Ensuring process implementers have the required knowledge and the required technical and business understanding to deliver the process, and understand their role in the process
        Reviewing opportunities for process enhancements and for improving the efficiency and effectiveness of the process
        Addressing issues with the running of the process
        Providing input to the ongoing service improvement plan

    Process Manager/Problem Manager

        Liaison with all problem resolution groups to ensure swift resolution of problems within SLA targets
        Ownership and protection of the KEDB (Known Error Database)
        Gatekeeper for the inclusion of all Known Errors and management of search algorithms
        Formal closure of all Problem Records
        Liaison with suppliers, contractors, etc. to ensure that third parties fulfill their contractual obligations, especially with regard to resolving problems and providing problem-related information and data
        Arranging, running and documenting Major Problem Reviews (Critical/High priority), and all related follow-up activities
        Ensure that the correct number and level of resources is available in the problem-solving team
        Validate problems and ensure each has been set with the correct priority

    Problem-Solving Group

        Investigates, diagnoses and isolates the root cause
        Update the KEDB with known errors
        Develop corrective action plans to implement a permanent solution
        Escalate issues, risks and obstacles to the Problem Manager
        Request 3rd party company (Suppliers/partners) involvement when needed
        Verify problem resolution with the initiator
        Update the problem record
        Create a problem record as a proactive action to prevent an incident from occurring

    Figure - 5


    9. Roles Assignment Matrix

    Role                                                       Name of Resources   Location   Tel        Email               Time Zone
    Process Owner                                              Patrick Ma          Toronto    905-861-   [email protected]   EST
    Process Manager (Problem Manager)                          Patrick Ma          Toronto    905-861-   [email protected]   EST
    Problem-Solving group(s) (IT Service Support Specialist)   TBD                 TBD        TBD        TBD                 TBD
    Third party Companies (Suppliers/partners)                 TBD                 TBD        TBD        TBD                 TBD


    10. Problem Priorities

    A problem is prioritized in the same way as an incident: the priority depends on urgency and impact. The points below should also be taken into account when setting the priority of a problem record.

    Can the system be recovered, or does it need to be replaced?

    How much will it cost?

    How many people, with what skills, will be needed to fix the problem?

    How long will it take to fix the problem?

    How extensive is the problem (e.g. how many CIs are affected)?

    Critical: Complete outage or partial outage of service(s) or component(s) that stops one or more of the Vital Business Functions, causing significant loss of revenue or of the ability to deliver important public services.

    Service(s) or component(s) supporting a critical business process are down or not functioning correctly, or one or several critical business processes are unavailable, affecting all users. There is no workaround.

    High: Severely affecting some key users, or impacting a large number of users.

    Service(s) or component(s) are not down, but there is a serious problem affecting a great majority of the users and their productivity, or affecting an individual's ability to conduct business effectively. The workaround (if provided) is awkward and inefficient.

    Medium: No severe impact.

    Service(s) or component(s) are not down, but there is a problem affecting a small number of users. Business-critical work can be performed. An acceptable workaround is available.

    Low:

    Service(s) or component(s) are not down and business-critical work can be performed, but cosmetic work would be beneficial.


    11. Impact-Urgency Matrix

                         Impact
    Urgency     High     Medium     Low
    High        1        2          3
    Medium      2        3          4
    Low         3        4          5

    Cell values: Priorities
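    The matrix can be read as a simple lookup from an (impact, urgency) pair to a priority code. The sketch below is a minimal illustration in Python; the names `PRIORITY_MATRIX` and `priority_code` are illustrative and not taken from any tool named in this document.

```python
# Priority codes from the Impact-Urgency Matrix, keyed by (impact, urgency).
PRIORITY_MATRIX = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("high", "low"): 3,
    ("medium", "high"): 2,
    ("medium", "medium"): 3,
    ("medium", "low"): 4,
    ("low", "high"): 3,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def priority_code(impact: str, urgency: str) -> int:
    """Look up the priority code for the given impact and urgency levels."""
    key = (impact.strip().lower(), urgency.strip().lower())
    if key not in PRIORITY_MATRIX:
        raise ValueError(f"unknown impact/urgency combination: {key}")
    return PRIORITY_MATRIX[key]
```

    Because the matrix is symmetric, swapping impact and urgency gives the same code, so the lookup does not depend on which axis is read as rows.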

    12. Problem Service Level Targets Definition

                          Service Level Targets
    Code   Priority   Accept Problem Record   Apply Root Cause Analysis   Permanent Resolution
    1      Critical   4 hr                    48 hr                       N/A
    2      High       12 hr                   4 days                      N/A
    3      Medium     48 hr                   10 days                     N/A
    4      Low        7 days                  21 days                     N/A
    5      Planned    Planning

    Figure - 6

    Figure - 7
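    As an illustration of how these targets might be applied, the sketch below computes the two target deadlines for a problem record. It assumes, since the document does not say otherwise, that targets are counted in calendar time from the moment the record is created; the names `SERVICE_LEVEL_TARGETS` and `target_deadlines` are illustrative.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# (accept problem record, apply root cause analysis) per priority code.
# Priority 5 (Planned) has no fixed targets in the table, so it is omitted.
SERVICE_LEVEL_TARGETS = {
    1: (timedelta(hours=4), timedelta(hours=48)),
    2: (timedelta(hours=12), timedelta(days=4)),
    3: (timedelta(hours=48), timedelta(days=10)),
    4: (timedelta(days=7), timedelta(days=21)),
}


def target_deadlines(priority_code: int,
                     created_at: datetime) -> Optional[Tuple[datetime, datetime]]:
    """Return (accept-by, root-cause-analysis-by), or None for Planned problems."""
    if priority_code == 5:
        return None  # handled through planning, not fixed targets
    accept_within, rca_within = SERVICE_LEVEL_TARGETS[priority_code]
    return created_at + accept_within, created_at + rca_within
```

    If the organization measures targets in business hours rather than calendar time, the arithmetic above would need a working-calendar adjustment.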


    13. Major Problem Review

    After every major problem (as determined by the priority definition), a review should be conducted while memories are still fresh, to learn any lessons for the future.

    Specifically, the review should examine:

    Those things that were done correctly

    Those things that were done wrong

    What could be done better in the future

    How to prevent recurrence

    Whether there has been any third-party responsibility and whether follow-up

    actions are needed.

    Such reviews can be used as part of training and awareness activities for support staff

    and any lessons learned should be documented in appropriate procedures, work

    instructions, diagnostic scripts or Known Error Records. The Problem Manager facilitates

    the session and documents any agreed actions.

    It is recommended the review take place within three days from problem closure.

    14. Known Error Database

    The purpose of a Known Error Database is to store previous knowledge of incidents and problems, and how they were overcome, to allow quicker diagnosis and resolution if they recur. The Known Error Record should hold exact details of the fault and the symptoms that occurred, together with precise details of any workaround or resolution action that can be taken to restore the service and/or resolve the problem. An incident count is also useful for determining how frequently incidents are likely to recur and for influencing priorities.
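    The fields described above could be modeled as a small record type. The sketch below is a hedged illustration in Python, not the schema of any actual KEDB tool; the class name, field names and the `find_by_symptom` helper are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KnownErrorRecord:
    """One Known Error Database entry, with the fields named in section 14."""
    fault: str                 # exact details of the fault
    symptoms: List[str]        # symptoms that occurred
    workaround: str = ""       # steps that restore the service
    resolution: str = ""       # permanent fix, if one exists
    incident_count: int = 0    # how many incidents matched this error


def find_by_symptom(records: List[KnownErrorRecord],
                    keyword: str) -> List[KnownErrorRecord]:
    """Return records with a symptom containing `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return [r for r in records
            if any(needle in s.lower() for s in r.symptoms)]
```

    A service desk lookup would then be a call such as `find_by_symptom(records, "timeout")`, with `incident_count` incremented each time a new incident is matched to an existing record.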

    15. Process Deliverables

    Rejected problem record

    Accepted problem record

    Known Error/Workaround

    Permanent solution


    16. Process Measurement (Metrics) and Reporting

    The metrics below are used to judge the effectiveness and efficiency of the Problem Management process, or its operation:

    16.1. Metrics

    The total number of problems recorded in the period (as a control measure)

    The percentage of problems resolved within SLA targets (and the percentage that are not)

    The number and percentage of problems that exceeded their target resolution times

    The backlog of outstanding problems and the trend (static, reducing or increasing)

    The average cost of handling a problem

    The number of major problems (opened, closed and backlog)

    The percentage of Major Problem Reviews successfully performed

    The number of Known Errors added to the KEDB

    The percentage accuracy of the KEDB (from audits of the database)

    The percentage of Major Problem Reviews completed successfully and on time.
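    A few of these metrics can be derived directly from the problem records. The sketch below is a minimal illustration in Python, assuming each record only tells us whether the problem is resolved and whether it met its target, and that percentages are taken over resolved problems; `ProblemSummary` and `problem_metrics` are illustrative names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ProblemSummary:
    resolved: bool
    within_target: Optional[bool] = None  # None while the problem is still open


def problem_metrics(problems: Iterable[ProblemSummary]) -> Dict[str, float]:
    """Compute total recorded, backlog and target-compliance percentages."""
    items = list(problems)
    resolved = [p for p in items if p.resolved]
    met = sum(1 for p in resolved if p.within_target)
    # Percentages are relative to resolved problems; 0.0 when none are resolved.
    pct_met = 100.0 * met / len(resolved) if resolved else 0.0
    pct_missed = 100.0 - pct_met if resolved else 0.0
    return {
        "total_recorded": len(items),
        "backlog": len(items) - len(resolved),
        "pct_within_target": pct_met,
        "pct_exceeded_target": pct_missed,
    }
```

    Comparing `backlog` between reporting periods gives the trend (static, reducing or increasing) mentioned in the list.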


    17. Process Meetings

    17.1. Problem Management Meeting

    Title: Problem Management Meeting

    Purpose:

    The purpose of this meeting is to control and minimize the impact on the business environment of incidents, problems and changes that are caused by errors within the IT environment. The Problem Manager and the problem-solving group(s) meet to review problem records, problem trending and failed changes, and to ensure that the root cause is isolated and a corrective action plan is developed.

    Frequency:

    Weekly

    Role Players (Attendees):

    Problem Manager

    IT Lead team (Problem-Solving Group(s))

    3rd Party Companies (If required)

    Business Manager(s) (If required)

    Incident Manager (If required)

    Change Manager (If required)

    Agenda Content:

    Review open problem records

    Problem records backlog

    Review the Root Cause Analysis assignment and progress

    Approve/Reject problem resolution

    Conduct a Major Problem Review (If any)

    Develop action plan for the outstanding problems

    Update records

    Close completed problem records

    Review problem management process performance (reports from the system)

    Process Improvement

    Improvement opportunities identified and discussed

    Meeting closure

    Review known errors that require a permanent solution

    The agenda needs to be submitted to all invitees at least 24 hours before the meeting.


    Method of Communication:

    Face to face or,

    Conference Call (Tel Number: 1-88...) or,

    Electronically through a supporting tool and emails.


    17.2. Monthly Meeting

    Title:

    Monthly Process Governance Meeting

    Purpose:

    Overall review of process performance

    Identify gaps and develop action plans to accommodate solutions

    Review the report on changes created during the last month and outstanding incidents

    Ensure that corrective action has been taken and that it was effective

    Frequency:

    Monthly

    Role Players:

    Problem Manager (facilitator; prepares the agenda and writes the minutes of the meeting)

    Process Owner

    IT Directors and Vice Presidents (Infrastructure & Applications)

    Business operation representative

    Agenda Content:

    Comparison between required and actual performance

    Review business impacts and reports on total problem cost

    Reports on overall SLA performance (breaches vs. exceeding the agreed service level targets)

    Review the status of the actions assigned during previous meetings

    Develop action plan for the new outstanding issues

    The agenda will be submitted to the Problem Manager at least two days before the meeting

    Method of Communication:

    Conference Call (Tel Number: 1-88...)

    Face to face

    Tools:

    Change Management System

    Repository for keeping meeting agenda and minutes


    18. Process RACI Chart

    Role columns of the chart, left to right: Problem Initiator; Problem Manager; Problem-Solving group; 3rd party Company (Suppliers/Partners). The RACI column below lists the codes of each row in that left-to-right order.

    Step      RACI      Activity
    1,2,3,4   AR        Problem triggered by Change management, Problem management (Proactive Activities), Incident management (Further Root Cause Analysis), Incident management (Post Incident Review); Problem Initiated
    5         AR        Create Problem Record
    6         AR        Does this Problem Exist somewhere else in the environment?
    7         AR        Create a Class Problem
    8                   Categorize and Prioritize Problem
    9         AR        Assign Problem Record to Problem Manager
    10        AR        Problem Record Resides under the Problem Manager Queue
    11        AR        Review and Validate Problem
    12        AR        Valid Problem?
    13        AR        Update & Close Problem Record
    14        I AR      Inform Problem Initiator
    15        AR        Duplicate Problem?
    16        AR        Correct Priority?
    17        AR        Set the Correct Priority
    18        I AR      Inform Problem Initiator
    19        AR        3rd Party Required?
    20        AR C      Assign Problem to Appropriate 3rd Party Company for Root Cause Analysis
    21        AR        Assign Problem to Appropriate IT Service Specialist for Root Cause Analysis
    22        AR        Problem Record Resides Under the Appropriate Problem Management Queue
    23        AR        Investigate and Isolate Root Cause
    24        AR        Root Cause Found?
    25        AR C      Requires 3rd Party Company Participation?
    26        AR I      Send Request to 3rd Party Company
    27        I AR      Escalate to Problem Manager
    28        AR        Document Root Cause and Mark as Known Error, Add to Known Error Database
    29        AR        Is Permanent Solution Available?
    30        AR        Develop Corrective Action Plan
    31        AR        Evaluate Each CA to Determine if Change is Required
    32        AR        Change Required?
    33        I AR      Change Management Process: Create RFC
    34        AR        Change Management Process: Changes Tested
    35        AR        Implement Corrective Action Plan
    36        AR        Issues with CA Implementation?
    37        I AR      Notify Problem Manager
    38        AR        Verify CA Completion
    39        AR        All CA Completed?
    40        AR        Problem Resolved
    41        AR        Problem Evolved from a Major Incident?
    42        AR        Problem Resolution Needs to be Accepted by IT Team Leader
    43        C AR      Verify Problem Resolution with Initiator
    44        C AR      Is solution accepted?
    45        AR        Update Problem Record
    46        I AR      Notify Problem Manager on Resolution
    47        AR        Proactive Activity: Review Known Errors in Known Errors Database
    48        AR        Can Provide Permanent Solution?
    49        AR        Proactive Activity: Re-open the problem record
    50        AR        Proactive Activity: Discover Potential Incident
    51        AR        Prepare Problem Report, Send to Problem Manager for Review & Distribution
    52        I AR      Notify Problem Manager
    53        AR        Conduct a Weekly Problem Management Meeting with ITLT
    54        AR        Review Problem Status
    55        AR        Problem Can be Closed by ITLT?
    56        AR        Is it a Major Problem?
    57        AR C      Conduct a Major Problem Review
    58        AR C      Update and Close Problem Record
    59        AR C      Inform Problem-Solving Group; ensure the Known Error Database is Updated Accurately
    60        AR I      Inform Problem-Solving Group to Update CMDB (if applicable)
    61        AR        Develop Action Plan
    62        AR        Monitor Implementation and Problem Status
    63        AR        Receive Request for Root Cause Analysis
    64        AR        Provide Assistance or Full Problem Resolution
    65        AR        Can Provide Permanent Solution?
    66        C AR      Access Permitted to LCL Known Errors Database?
    67        I AR      Inform LCL IT Service Specialist to add the Known Errors and Workaround to KEDB
    68        AR        Follow LCL Problem Management Process
    69        AR        Add Known Errors and Workaround to KEDB (3rd Party Company)
    70        I I AR    Inform LCL IT Service Specialist/Problem Manager on Problem Resolution

    Legend   Explanation
    R        Responsible for the action but not necessarily an authority or approval
    A        Accountable for the action, only one person
    C        Consulted before or during the action
    I        Informed


    19. Process Detailed Description

    Step / Activity / Explanation

    Steps 1, 2, 3, 4: Problem triggered by Change management, Problem management (Proactive Activities), Incident management (Further Root Cause Analysis), Incident management (Post Incident Review); Problem Initiated

    A problem can be triggered by the following:

    Change management: if the testing or implementation of a change did not succeed and the change tester/implementer does not know the reason, a problem record is created to find the root cause of the failure.

    Problem management (Proactive activities): by reviewing problem and incident reports, patterns of failures, IT infrastructure monitoring (alerts from systems) and the outcome of the problem or incident meeting, a problem record is created. It is not associated with an incident; it is created to prevent an incident from happening.

    Incident management (RCA): during the incident management lifecycle, in order to provide a resolution to an incident, the problem-solving group is required to perform a root cause analysis: investigating, diagnosing and analyzing more deeply to isolate and identify the root cause of the incident.

    Incident management (Post Incident Review): after the resolution of a critical or high incident, a post-incident review is conducted. One of the outcomes of this review is a problem record created to identify the root cause of the incident.

    Step 5: Create Problem Record

    The problem record is created in the problem management system; the requester should complete the required information. See the Problem Logging Template in Section 21, Attachments.

    Step 6: Does this Problem Exist somewhere else in the environment?

    The problem initiator should consider the possibility that the same problem exists somewhere else in the IT environment. The problem initiator works within his/her knowledge and can share the concern with others (with higher expertise) to get the correct data.

    Example: Router Brand XX Model 123 is experiencing a problem after downloading a new version of software. There are five of them in the IT infrastructure.

    Step 7: Create a Class Problem

    A Class Problem Record is created to cover one single problem on multiple components.


    Step 8: Categorize and Prioritize Problem

    The problem initiator should select the correct category, such as Hardware > Network > Router > Brand XX. Problems must be categorized in the same way as incidents so that the true nature of the problem can be easily traced in the future and meaningful management information can be obtained. The initiator also needs to select the appropriate priority for the problem, which depends on the urgency and the impact of the problem.

    Step 9: Assign Problem Record to Problem Manager

    The problem record is dispatched to the Problem Manager automatically once the required fields have been completed and the record has been submitted.

    Step 10: Problem Record Resides under the Problem Manager Queue

    The record resides in the Problem Manager queue, waiting for the Problem Manager to open it, review it and proceed with the process.

    Step 11: Review and Validate Problem

    The Problem Manager opens and reviews the problem record, checking whether the required information is complete and whether it is a valid problem (sometimes incidents are created as problems and dispatched mistakenly to the Problem Manager).

    Step 12: Valid Problem?

    If YES then GOTO activity 15

    If NO then Continue with activity 13

    Step 13: Inform Problem Initiator

    The Problem Manager calls the problem initiator by phone and explains the reasons behind the rejection (based on the problem definition and criteria, this is not a problem but an incident), and advises the initiator to create an incident record in the incident management system.

    Step 14: Update & Close Problem Record

    The Problem Manager updates the problem record with his/her reasons for rejecting the request, and closes the record.

    Process END
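    The validation branch in activities 12 to 14 can be sketched as follows. This is a minimal illustration in Python, assuming the validity decision itself is made by the Problem Manager and passed in as a flag; the `ProblemRecord` fields and the `validate_problem` function are hypothetical names, not part of the problem management system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProblemRecord:
    summary: str
    status: str = "open"
    rejection_reason: str = ""


def validate_problem(record: ProblemRecord, is_valid: bool,
                     reason: str = "") -> Optional[int]:
    """Apply the activity 12 decision and return the next activity number.

    Returns 15 for a valid problem. For an invalid one, the rejection reason
    is recorded and the record is closed (activity 14), and None is returned
    because the process ends. Informing the initiator by phone (activity 13)
    is a manual step and is not modeled here.
    """
    if is_valid:
        return 15
    record.rejection_reason = reason
    record.status = "closed"
    return None
```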


    Step Activity Explanation15 Duplicate Problem? Problem manger checks if this problem has been created before for

    the same problem by the same initiator or different one, and this is

    a duplicate.

    For a record to be considered a duplicate, the following points

    must be taken into account:

    The previous problem record must still be open.

    Both records have the same configuration item and the same problem description.

    Both records are associated with the same incident record or change record.

    The initiator is called to confirm the duplication.

    The problem manager relies on his/her knowledge of the existing open

    problems and searches the system by initiator name or configuration item.

    The tool might display an informational message (pop-up) when the same

    problem initiator or configuration item exists in a previously opened record.
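The duplicate criteria above can be sketched as a simple filter over existing records. This is a minimal illustration, not the behaviour of the problem management tool: the `ProblemRecord` fields and the function name are hypothetical, and treating all three criteria as required is one reading of the list. The result is only a list of candidates; the call to the initiator still confirms the duplication.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProblemRecord:
    # Hypothetical fields, chosen to mirror the duplicate criteria.
    record_id: str
    initiator: str
    configuration_item: str
    description: str
    is_open: bool
    linked_record: Optional[str] = None  # related incident or change record ID


def _normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def find_possible_duplicates(
    new: ProblemRecord, existing: List[ProblemRecord]
) -> List[ProblemRecord]:
    """Return open records that match `new` on CI, description and linked record."""
    matches = []
    for rec in existing:
        if rec.record_id == new.record_id or not rec.is_open:
            continue  # skip the record itself and any closed record
        same_ci = rec.configuration_item == new.configuration_item
        same_desc = _normalize(rec.description) == _normalize(new.description)
        same_link = (
            rec.linked_record is not None
            and rec.linked_record == new.linked_record
        )
        if same_ci and same_desc and same_link:
            matches.append(rec)
    return matches
```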

    16 Correct Priority? The problem manager reviews the current priority of the problem

    record against the agreed definition of priorities.

    If it is correct then GOTO activity 19

    If NO then Continue with activity 17

    17 Inform Problem Initiator The problem manager phones the problem initiator and informs

    him/her that the wrong priority was selected.

    18 Set the Correct Priority The problem manager sets the correct priority on the problem record

    based on the priority definition.

    19 3rd Party Required? Determine whether the problem exists in components or services

    managed or maintained by a 3rd party company, or requires expertise

    that does not exist inside the organization.

    If YES then Continue with activity 20

    If NO then GOTO activity 21

    20 Request the Loblaws

    Business Relationship

    Manager to Assign problem to

    the Appropriate 3rd Party

    Company for Root Cause

    Analysis

    The problem manager contacts the business relationship manager to

    assign the problem record to the 3rd party company.

    If the 3rd party company has access to the problem management system,

    the problem manager dispatches the record and phones to confirm that

    the record was received. If no access to the system has been granted,

    the problem manager sends an email with the problem record details

    and phones as well.

    GOTO activity 63

    21 Assign Problem to Appropriate

    Problem-Solving Group (IT Service

    Specialist for Root Cause Analysis)

    The problem manager assigns the problem record to the appropriate

    problem-solving group, depending on the category of the problem.

    22 Problem Record Reside Under the

    Appropriate Problem

    Management Queue

    The problem record resides in the problem-solving group's queue.

    The group receives an automatic notification from the system, which

    can be sent by email and as a message through the system.

    23 Investigate and Isolate Root Cause The problem-solving group receives the problem record, opens and

    reviews the problem details, and performs investigation and diagnosis

    activities to isolate the root cause.

    An investigation should be conducted to try to diagnose the root

    cause of the problem. The speed and nature of this investigation

    will vary depending upon the impact, severity and urgency of the

    problem, but the appropriate level of resources and expertise

    should be applied to finding a resolution, proportionate with the

    priority code allocated and the service target in place for that

    priority level.

    There are many problem analysis, diagnosis and solving techniques

    available and much research has been done in this area. Some of

    the most useful and frequently used techniques include:

    Chronological Analysis

    Pain Value Analysis

    Kepner and Tregoe

    Brainstorming

    Ishikawa Diagrams

    Pareto Analysis
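As an illustration of one of the listed techniques, Pareto Analysis ranks causes by frequency and isolates the few that account for most occurrences. The sketch below is a minimal version under stated assumptions: the function name `pareto`, the input shape (one cause label per incident), and the 80% default threshold are illustrative choices, not part of this process.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pareto(causes: Iterable[str], threshold: float = 0.8) -> List[Tuple[str, int, float]]:
    """Return the most frequent causes that together reach `threshold`.

    Each result row is (cause, count, cumulative share of all occurrences).
    """
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, count in counts.most_common():
        cumulative += count
        share = cumulative / total
        rows.append((cause, count, share))
        if share >= threshold:
            break  # the remaining causes are the "trivial many"
    return rows
```

For example, given one cause label per related incident, the returned rows identify which causes to investigate first.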

    24 Root Cause Found? If the problem-solving group found the root cause then GOTO

    activity 28

    If NO then Continue with activity 25

    25 Requires 3rd Party Company

    Participation?

    If YES then Continue with activity 26

    If NO then GOTO activity 27

    26 Send Request to Loblaws

    Business Relationship

    Manager to Assign Problem to

    the Appropriate 3rd Party

    Company

    The problem-solving group sends the request to the business

    relationship manager, who forwards it to the appropriate 3rd party

    company to assist in identifying the root cause.

    GOTO activity 63

    27 Escalate to Problem Manager If neither the problem-solving group nor the 3rd party company can

    identify the root cause, the problem-solving group escalates the

    problem to the problem manager as an issue. It is added as an item to

    the weekly problem management meeting, where the next step is discussed

    and decided and an action plan is developed. The problem manager follows

    up and monitors the implementation of those actions until the issue

    is resolved.

    GOTO activity 53

    28 Document Root Cause and Mark

    as Known Error, Add to Known

    Error Database

    The problem-solving group documents the identified root cause

    and registers the error in the Known Error Database as a known

    error.

    If a workaround is found during the root cause analysis activity,

    it should be recorded in the problem record and the problem record

    kept open. It is important that work on a permanent resolution

    continues where this is justified. If no work on a permanent

    solution is planned, the problem record can be closed.

    Later, during the regular review of the known errors that are

    pending a permanent solution, a corrective action plan can be

    developed to provide a permanent solution if possible.

    29 Is Permanent Solution Available? If YES then Continue with activity 30

    If NO GOTO activity 51

    30 Develop Corrective Action Plan The problem-solving group develops a corrective action plan to

    eliminate the root cause permanently.

    The action plan contains, but is not limited to, the following:

    Tasks

    Resources assigned against each task

    Timeline for each task

    Approver of each task

    Objective of the plan

    31 Evaluate Each CA to Determine if

    Change is Required

    The problem-solving group evaluates each corrective action to

    determine if change is required.
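The corrective action plan and the per-action change evaluation can be sketched as a small data structure. This is a minimal illustration: the class and field names are hypothetical and simply mirror the plan contents listed above (tasks, resources, timeline, approver, objective), with a flag recording the outcome of the evaluation of each corrective action (CA).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class CorrectiveAction:
    # Hypothetical fields mirroring the plan contents.
    task: str
    resource: str
    due: date
    approver: str
    requires_change: bool = False  # outcome of evaluating this CA
    completed: bool = False


@dataclass
class CorrectiveActionPlan:
    objective: str
    actions: List[CorrectiveAction] = field(default_factory=list)

    def actions_needing_rfc(self) -> List[CorrectiveAction]:
        """Actions for which a request for change must be raised."""
        return [a for a in self.actions if a.requires_change]

    def all_completed(self) -> bool:
        """True only when the plan has actions and none is outstanding."""
        return bool(self.actions) and all(a.completed for a in self.actions)
```

Recording the change decision per action, rather than per plan, keeps actions that need no change free to proceed directly to implementation.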

    32 Change Required? If YES then Continue with activity 33

    If NO then GOTO activity 35

    33 Change Management Process

    Create RFC

    The problem-solving group creates the request for change and

    follows the change management process to get the change

    assessed and approved

    34 Change Management Process

    Changes Tested

    The change is tested in the development environment before it is

    implemented in the production environment.

    35 Implement Corrective Action Plan The corrective action plan is implemented.

    36 Issues with CA Implementation? If YES then Continue with activity 37

    If NO then GOTO activity 38

    37 Notify Problem Manager The problem-solving group notifies the problem manager of the

    issue(s) encountered during the implementation, such as a shortage

    of resources, overtime payment, or technology constraints.

    The problem manager reviews and discusses the raised issue(s)

    during the problem management meeting and invites the people

    concerned to come up with an immediate solution or action plan.

    GOTO activity 53

    38 Verify CA Completion The problem-solving group verifies that the corrective actions are

    complete, ensuring that all tasks were completed as per the plan and

    that no task was missed.

    39 All CA Completed? If YES then Continue with activity 40

    If NO then GOTO activity 52 and, in parallel, GOTO activity 35

    40 Problem Resolved Problem resolved by implementing the corrective actions that

    eliminated the root cause and provided a permanent solution.

    41 Problem Evolved from a Major Incident?

    If the problem evolved from or is associated with a critical or high priority

    incident then Continue with activity 42

    If NO then GOTO activity 43

    42 Problem Resolution Needs to be

    Accepted by IT Team Leader

    The problem-solving group phones the IT team leader to

    verify and accept the solution before notifying the problem

    manager of the resolution.

    GOTO activity 44

    43 Verify Problem Resolution with

    Initiator

    If the problem is not associated with a major incident, the

    problem-solving group phones the problem initiator to verify the

    resolution.

    44 Is solution accepted? If the solution is accepted then Continue with activity 45

    If the solution is not accepted then GOTO activity 35 and, in parallel,

    notify the problem manager of the situation (GOTO activity 53).

    45 Update Problem Record The problem-solving group updates the problem record with the details

    of the work done. It is recommended to attach the corrective action

    plan to the record and to document the results.

    46 Notify Problem Manager on

    Resolution

    A notification is sent to the problem manager to inform

    him/her of the resolution. The notification can be made by one or more

    of the following methods:

    Phone call (MUST)

    Email

    Automatic notification through the system

    The next step is to evaluate the resolution and close the record by

    the problem manager.

    GOTO activity 53

    47 Proactive Activity

    Review Known Errors in Known

    Errors Database

    The problem-solving group for a given IT area (such as Network,

    Servers and Active Directory, Applications, or Security) should

    periodically review the known errors with no permanent solution that

    are registered in the Known Errors Database within their technical area.

    The purpose of the review is to evaluate the possibility of providing

    a permanent solution to the problem.

    48 Can Provide Permanent Solution? If YES then Continue with activity 49

    If NO then continue monitoring and reviewing the registered

    known errors in KEDB GOTO activity 47

    49 Proactive Activity

    Re-open the problem record

    If the problem record was closed, re-open it; otherwise just

    follow the process.

    GOTO activity 30

    50 Proactive Activity

    Discover Potential Incident

    A potential incident can be discovered by the problem-solving group,

    by members of the problem management meeting, or by other IT staff

    when reviewing reports from the system or alerts generated by the

    monitoring systems.

    GOTO activity 5

    51 Prepare Problem Report,

    Send to Problem Manager

    for Review & Distribution

    The problem-solving group prepares a problem report that explains the

    activities that took place, the result of the root cause analysis, and

    the reasons for not providing a permanent solution.

    The report is sent to the problem manager for his/her review and distribution.

    GOTO activity 53

    52 Notify Problem Manager The problem-solving group notifies the problem manager when not

    all the corrective actions are completed and more time and effort

    are required to complete them.

    GOTO activity 53

    53 Conduct a Weekly Problem

    Management Meeting with ITLT

    The problem manager conducts a weekly meeting with the IT Lead

    Team (ITLT), the problem-solving group, and others, depending on the

    agenda items.

    (Please see section 17.1 Process Weekly Meeting)

    54 Review Problem Status Review the open problem records and the problems being resolved,

    review issues and take action to resolve them, and review reports

    generated from the system on the current status of the problems in

    the IT environment.

    55 Problem Can be Closed by ITLT? If the problem has been provided with a solution and can be closed then

    Continue with activity 56

    If the problem is still open and intervention is required to expedite the

    work or to resolve outstanding issues that keep the problem

    open, then GOTO activity 61

    56 Is it a Major Problem? A major problem is a problem with critical or high priority.

    If YES then Continue with activity 57

    If NO then GOTO activity 58

    57 Conduct a Major Problem Review The problem manager conducts a major problem review

    (See Section 13 Major Problem Review)

    58 Update and Close Problem Record The problem manager performs a check at this time to ensure that the

    record contains a full historical description of all events. If it does

    not, the record should be updated. The problem record is then formally

    closed.

    59 Inform Problem-Solving Group to

    Assure the Known Error Database

    Updated Accurately.

    The problem manager informs the problem-solving group to update the

    KEDB accurately.

    60 Inform Problem-Solving Group to

    Update CMDB (if applicable)

    The problem manager notifies the configuration manager of any

    change to a configuration item that took place during the problem-solving

    process.

    Process End

    61 Develop Action Plan The problem manager is accountable and responsible for developing an

    action plan to overcome issues or to expedite the problem-solving

    activities.

    62 Monitor Implementation and

    Problem Status

    The problem manager monitors the implementation of the action plan

    and the problem status.

    GOTO activity 55
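The closure loop in activities 55 to 62 can be sketched as a transition table. This is a minimal illustration rather than a definitive model: the names `TRANSITIONS` and `walk` are hypothetical, and steps without an explicit GOTO are assumed to continue to the next numbered activity.

```python
# Next activity for each step in the closure loop (activities 55 to 62).
# Decision steps map an answer ("yes"/"no") to the next activity;
# other steps map None to the next activity. "END" marks Process End.
TRANSITIONS = {
    55: {"yes": 56, "no": 61},   # Problem can be closed by ITLT?
    56: {"yes": 57, "no": 58},   # Is it a major problem?
    57: {None: 58},              # Major problem review (assumed sequential)
    58: {None: 59},              # Update and close problem record
    59: {None: 60},              # KEDB update
    60: {None: "END"},           # CMDB update, then Process End
    61: {None: 62},              # Develop action plan
    62: {None: 55},              # Monitor, then back to activity 55
}


def walk(start, answers):
    """Follow the table from `start`, using one answer per decision step."""
    remaining = iter(answers)
    path = [start]
    step = start
    while step != "END":
        options = TRANSITIONS[step]
        if None in options:
            step = options[None]
        else:
            answer = next(remaining, None)
            if answer is None:
                break  # no more answers: stop at the pending decision
            step = options[answer]
        path.append(step)
    return path
```

For example, `walk(55, ["no", "yes", "no"])` traces a problem that first needs an action plan, returns to activity 55, and is then closed without a major problem review.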

    63 Receive Request for Root Cause

    Analysis

    The 3rd party company receives a request from the LCL problem-solving

    group or from the problem manager.

    64 Provide Assistance or Full Problem

    Resolution

    The 3rd party company can participate partially or fully in the

    problem-solving activities. (Partial participation means, for example,

    providing assistance to the LCL group in the root cause analysis or

    in implementing the permanent solution.)

    65 Can Provide Permanent Solution? Can the 3rd party company provide a permanent solution?

    If YES then GOTO activity 68

    If NO then Continue with activity 66

    66 Access Permitted to LCL Known

    Errors Database?

    Access refers to the level and extent of a service's functionality or

    data that a user is entitled to use.

    If the 3rd party company has access to the KEDB then GOTO activity 69.

    If NO then Continue with activity 67

    67 Inform LCL IT Service Specialist to

    add the Known Errors and

    Workaround to KEDB

    The 3rd party company informs the problem-solving group to add

    the known error to the Known Error Database.

    68 Follow LCL Problem Management Process

    The 3rd party company should follow the LCL problem management

    process, for example by attending the LCL problem management meeting,

    notifying the problem manager when required, adding the known

    errors to the LCL KEDB (if access is granted), and updating the problem

    record (if access is granted).

    69 Add Known Errors and

    Workaround to KEDB

    (3rd Party Company)

    The errors are added to the Known Error Database.

    70 Inform LCL IT Service

    Specialist/Problem Manager on

    Problem Resolution

    Inform the problem-solving group and the problem manager on

    problem resolution

    20. Legend & Definitions

    Legend Explanation

    Problem A cause of one or more Incidents. The cause is not usually known at the time

    a Problem Record is created, and the Problem Management Process is

    responsible for further investigation.

    Workaround Reducing or eliminating the Impact of an Incident or Problem for which a full

    Resolution is not yet available. For example by restarting a failed

    Configuration Item. Workarounds for Problems are documented in Known

    Error Records. Workarounds for Incidents that do not have associated

    Problem Records are documented in the Incident Record.

    Known Error A Problem that has a documented Root Cause and a Workaround. Known

    Errors are created and managed throughout their Lifecycle by Problem

    Management. Known Errors may also be identified by Development or

    Suppliers.

    Trend Analysis Analysis of data to identify time-related patterns. Trend Analysis is used in

    Problem Management to identify common Failures or fragile Configuration

    Items, and in Capacity Management as a Modeling tool to predict future

    behavior. It is also used as a management tool for identifying deficiencies in

    IT Service Management Processes.

    Root Cause Analysis An Activity that identifies the Root Cause of an Incident or Problem. RCA

    typically concentrates on IT Infrastructure failures.

    Proactive Problem

    Management

    Part of the Problem Management Process. The Objective of Proactive

    Problem Management is to identify Problems that might otherwise be

    missed. Proactive Problem Management analyses Incident Records, and uses

    data collected by other IT Service Management Processes to identify trends

    or significant problems.

    21. Attachments

    Problem AnalysisTechniques.doc

    Problem Recordtemplate.doc