handout


Upload: shifa

Post on 14-Dec-2015


RISK MANAGEMENT

Risk Management includes the process of conducting risk management planning, identification, analysis, response planning and controlling risk on a project. The objectives of risk management are to increase the likelihood and impact of positive events, and decrease the likelihood and impact of negative events in the project.

An overview of the Risk Management processes is as follows:

Perform Qualitative Risk Analysis-The process of prioritising risks for further analysis or action by assessing and combining their probability of occurrence and impact.

Plan Risk Responses-The process of developing options and actions to enhance opportunities and to reduce threats to project objectives.

Control Risks-The process of implementing risk response plans, tracking identified risks, monitoring residual risks, identifying new risks, and evaluating risk process effectiveness throughout the project.

Risk is an uncertain event or condition that, if it occurs, has a positive or negative effect on one or more project objectives such as scope, schedule, cost, and quality. A risk may have one or more causes and, if it occurs, it may have one or more impacts. A cause may be a given or potential requirement, assumption, constraint, or condition that creates the possibility of negative or positive outcomes. For example, causes could include the requirement of an environmental permit to do work, or having limited personnel assigned to design the project. The risk is that the permitting agency may take longer than planned to issue a permit; or, in the case of an opportunity, additional development personnel may become available who can participate in design, and they can be assigned to the project. If either of these uncertain events occurs, there may be an impact on the project scope, cost, schedule, quality or performance.

Risk conditions may include aspects of the project’s or organisation’s environment that contribute to project risk, such as immature project management practices, lack of integrated management systems, concurrent multiple projects, or dependency on external participants who are outside the project’s direct control.

PLAN RISK MANAGEMENT

Plan Risk Management is the process of defining how to conduct risk management activities for a project. The key benefit of this process is that it ensures that the degree, type, and visibility of risk management are commensurate with both the risks and the importance of the project to the organisation. The risk management plan is vital to communicate with and obtain agreement and support from all stakeholders to ensure the risk management process is supported and performed effectively over the project life cycle.

PERFORM QUALITATIVE RISK ANALYSIS

Perform Qualitative Risk Analysis is the process of prioritising risks for further analysis or action by assessing and combining their probability of occurrence and impact. The key benefit of this process is that it enables project managers to reduce the level of uncertainty and to focus on high-priority risks.

Perform Qualitative Risk Analysis assesses the priority of identified risks using their relative probability or likelihood of occurrence, the corresponding impact on project objectives if the risks occur, as well as other factors such as the time frame for response and the organisation’s risk tolerance associated with the project constraints of cost, schedule, scope and quality. Such assessments reflect the risk attitude of the project team and other stakeholders. Effective assessment therefore requires explicit identification and management of the risk approaches of key participants in the Perform Qualitative Risk Analysis process. Where these risk approaches introduce bias into the assessment of identified risks, attention should be paid to identifying bias and correcting for it.

Establishing definitions of the levels of probability and impact can reduce the influence of bias. The time criticality of risk-related actions may magnify the importance of a risk. An evaluation of the quality of the available information on project risks also helps to clarify the assessment of the risk’s importance to the project.

Perform Qualitative Risk Analysis: Inputs

Risk Management Plan

Key elements of the risk management plan used in the Perform Qualitative Risk Analysis process include roles and responsibilities for conducting risk management, budgets, schedule activities for risk management, risk categories, definitions of probability and impact, the probability and impact matrix, and revised stakeholders’ risk tolerances. These inputs are usually tailored to the project during the Plan Risk Management process.

Risk Register

The risk register contains the information that will be used to assess and prioritise risks.

Organisational Process Assets

The organisational process assets that can influence the Perform Qualitative Risk Analysis process include information on prior, similar or completed projects.

Perform Qualitative Risk Analysis: Tools and Techniques

Risk Probability and Impact Assessment

Risk probability assessment investigates the likelihood that each specific risk will occur. Risk impact assessment investigates the potential effect on a project objective such as schedule, cost, quality, or performance, including both negative effects for threats and positive effects for opportunities.

Probability and impact are assessed for each identified risk. Risks can be assessed in interviews or meetings with participants selected for their familiarity with the risk categories on the agenda. Project team members and knowledgeable persons external to the project are included.

The level of probability for each risk and its impact on each objective is evaluated during the interview or meeting. Explanatory detail, including assumptions justifying the levels assigned, is also recorded. Risk probabilities and impacts are rated according to the definitions given in the risk management plan. Risks with low ratings of probability and impact will be included within the risk register as part of the watch list for future monitoring.

Probability and Impact Matrix

An organisation can rate a risk separately for each objective (e.g. cost, time, and scope). In addition, it may develop ways to determine one overall rating for each risk. Finally, opportunities and threats are handled in the same matrix using definitions of the different levels of impact that are appropriate for each.

The risk score helps guide risk responses. For example, risks that would have a negative impact on objectives if they occur (threats) and that fall in the high-risk zone of the matrix may require priority action and aggressive response strategies. Threats found in the low-risk zone may not require proactive management action beyond being placed in the risk register as part of the watch list or adding a contingency reserve. Similarly, opportunities in the high-risk zone, which may be obtained most easily and offer the greatest benefit, should be targeted first. Opportunities in the low-risk zone should be monitored.
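As a minimal sketch of how such a matrix might be scored, the example below multiplies a probability level by an impact level and maps the result to a zone. The numeric scales, the threshold values, and the names `risk_score` and `risk_zone` are illustrative assumptions; the handout does not prescribe any of them, and an organisation would define its own in the risk management plan.

```python
# Illustrative numeric scales; the handout does not prescribe any values.
PROBABILITY = {"low": 0.1, "medium": 0.5, "high": 0.9}
IMPACT = {"low": 0.1, "medium": 0.4, "high": 0.8}


def risk_score(probability: str, impact: str) -> float:
    """Return probability x impact for one risk against one objective."""
    return PROBABILITY[probability] * IMPACT[impact]


def risk_zone(score: float, low_limit: float = 0.05, high_limit: float = 0.25) -> str:
    """Map a score to a matrix zone; both limits are assumed thresholds."""
    if score >= high_limit:
        return "high"
    if score >= low_limit:
        return "moderate"
    return "low"


if __name__ == "__main__":
    for p, i in [("high", "high"), ("medium", "medium"), ("low", "low")]:
        score = risk_score(p, i)
        print(p, i, round(score, 2), risk_zone(score))
```

Under these assumed scales, a threat rated high on both axes lands in the high-risk zone and would be a candidate for priority action, while one rated low on both would go to the watch list.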

Risk Data Quality Assessment

Risk data quality assessment is a technique to evaluate the degree to which the data about risks is useful for risk management. It involves examining the degree to which the risk is understood and the accuracy, quality, reliability, and integrity of the data about the risk.

The use of low-quality risk data may lead to a qualitative risk analysis of little use to the project. If data quality is unacceptable, it may be necessary to gather better data. Often, the collection of information about risks is difficult, and consumes more time and resources than originally planned. The number of steps in the rating scale is usually established when defining the risk attitude of the organisation.

Risk Categorisation

Risks to the project can be categorised by sources of risk to determine the areas of the project most exposed to the effects of uncertainty. Risks can also be categorised by common root cause. This technique helps determine work packages, project phases or even roles in the project, which can lead to the development of effective risk responses.

Risk Urgency Assessment

Risks requiring near-term responses may be considered more urgent to address. Indicators of priority may include probability of detecting the risk, time to effect a risk response, symptoms and warning signs, and the risk rating. In some qualitative analyses, the assessment of risk urgency is combined with the risk ranking that is determined from the probability and impact matrix to give a final risk severity rating.
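One possible way to combine urgency with the matrix ranking is sketched below. The handout does not say how the two are combined, so the multiplier rule, the day limits, and the names `urgency_factor` and `severity_rating` are all hypothetical.

```python
def urgency_factor(days_until_response_needed: int) -> float:
    """Assumed rule: the sooner a response is needed, the larger the factor."""
    if days_until_response_needed <= 7:
        return 1.5
    if days_until_response_needed <= 30:
        return 1.2
    return 1.0


def severity_rating(pi_score: float, days_until_response_needed: int) -> float:
    """Scale a probability-impact score by the urgency factor."""
    return round(pi_score * urgency_factor(days_until_response_needed), 3)


# Hypothetical risks: (name, probability-impact score, days until a response is needed)
risks = [("permit delay", 0.20, 5), ("supplier change", 0.25, 90)]
ranked = sorted(risks, key=lambda r: severity_rating(r[1], r[2]), reverse=True)

if __name__ == "__main__":
    for name, score, days in ranked:
        print(name, severity_rating(score, days))
```

In this sketch the permit delay outranks the supplier change despite its lower matrix score, because a response is needed within a week.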

Expert Judgement

Expert judgement is required to assess the probability and impact of each risk. Experts generally are those having experience with similar, recent projects. Gathering expert judgement is often accomplished with the use of risk facilitation workshops or interviews. The experts’ bias should be taken into account in this process.

Perform Qualitative Risk Analysis: Outputs

Project Documents Updates

Project documents that may be updated include, but are not limited to:

Risk Register Updates-As new information becomes available through the qualitative risk assessment, the risk register is updated. Updates to the risk register may include assessments of probability and impacts for each risk, risk ranking or scores, risk urgency information or risk categorisation, and a watch list for low probability risks or risks requiring further analysis.

Assumptions Log Updates-As new information becomes available through the qualitative risk assessment, assumptions could change. The assumptions log needs to be revisited to accommodate this new information. Assumptions may be incorporated into the project scope statement or in a separate assumptions log.

PLAN RISK RESPONSES

Plan Risk Responses is the process of developing options and actions to enhance opportunities and to reduce threats to project objectives. The key benefit of this process is that it addresses the risks by their priority, inserting resources and activities into the budget, schedule and project management plan as needed.

The Plan Risk Responses process follows the Perform Quantitative Risk Analysis process (if used). Each risk response requires an understanding of the mechanism by which it will address the risk. This is the mechanism used to analyse if the risk response plan is having the desired effect. It includes the identification and assignment of one person (an owner for risk response) to take responsibility for each agreed-to and funded risk response. Risk responses should be appropriate for the significance of the risk, cost-effective in meeting the challenge, realistic within the project context, agreed upon by all parties involved, and owned by a responsible person. Selecting the optimum risk response from several options is often required.

The Plan Risk Responses process presents commonly used approaches to planning responses to the risks. Risks include threats and opportunities that can affect project success, and responses are discussed for each.

Plan Risk Responses: Inputs

Risk Management Plan

Important components of the risk management plan include roles and responsibilities, risk analysis definitions, timing for reviews (and for eliminating risks from review), and risk thresholds for low, moderate, and high risks. Risk thresholds help identify those risks for which specific responses are needed.

Risk Register

The risk register refers to identified risks, root causes of risks, lists of potential responses, risk owners, symptoms and warning signs, the relative rating or priority list of project risks, risks requiring responses in the near term, risks for additional analysis and response, trends in qualitative analysis results, and a watch list, which is a list of low-priority risks within the risk register.

Plan Risk Responses: Tools and Techniques

Several risk response strategies are available. The strategy or mix of strategies most likely to be effective should be selected for each risk. Specific actions are developed to implement that strategy, including primary and backup strategies, as necessary. A fall-back plan can be developed for implementation if the selected strategy turns out not to be fully effective or if an accepted risk occurs. Secondary risks should also be reviewed. Secondary risks are risks that arise as a direct result of implementing a risk response. A contingency reserve is often allocated for time or cost. If developed, it may include identification of the conditions that trigger its use.

Strategies for Negative Risks or Threats

Three strategies, which typically deal with threats or risks that may have negative impacts on project objectives if they occur, are: avoid, transfer, and mitigate. The fourth strategy, accept, can be used for negative risks or threats as well as positive risks or opportunities. Each of these risk response strategies has a varied and unique influence on the risk condition. These strategies should be chosen to match the risk’s probability and impact on the project’s overall objectives. Avoidance and mitigation strategies are usually good strategies for critical risks with high impact, while transference and acceptance are usually good strategies for threats that are less critical and with low overall impact. The four strategies for dealing with negative risks or threats are further described as follows:

Avoid-Risk avoidance is a risk response strategy whereby the project team acts to eliminate the threat or protect the project from its impact. It usually involves changing the project management plan to eliminate the threat entirely. The project manager may also isolate the project objectives from the risk’s impact or change the objective that is in jeopardy. Examples of this include extending the schedule, changing the strategy, or reducing scope. The most radical avoidance strategy is to shut down the project entirely. Some risks that arise early in the project can be avoided by clarifying requirements, obtaining information, improving communication, or acquiring expertise.

Transfer-Risk transference is a risk response strategy whereby the project team shifts the impact of a threat to a third party, together with ownership of the response. Transferring the risk simply gives another party responsibility for its management-it does not eliminate it. Transferring does not mean disowning the risk by transferring it to a later project or another person without their knowledge or agreement. Risk transference nearly always involves payment of a risk premium to the party taking on the risk. Transferring liability for risk is most effective in dealing with financial risk exposure. Transference tools can be quite diverse and include, but are not limited to, the use of insurance, performance bonds, warranties, guarantees, etc. Contracts or agreements may be used to transfer liability for specified risks to another party. For example, when a buyer has capabilities that the seller does not possess, it may be prudent to transfer some work and its concurrent risk contractually back to the buyer. In many cases, use of a cost-plus contract may transfer the cost risk to the buyer, while a fixed-price contract may transfer risk to the seller.

Mitigate-Risk mitigation is a risk response strategy whereby the project team acts to reduce the probability of occurrence or impact of a risk. It implies a reduction in the probability and/or impact of an adverse risk to be within acceptable threshold limits. Taking early action to reduce the probability and/or impact of a risk occurring on the project is often more effective than trying to repair the damage after the risk has occurred. Adopting less complex processes, conducting more tests, or choosing a more stable supplier are examples of mitigation actions. Mitigation may require prototype development to reduce the risk of scaling up from a bench-scale model of a process or product. Where it is not possible to reduce probability, a mitigation response might address the risk impact by targeting linkages that determine the severity. For example, designing redundancy into a system may reduce the impact from a failure of the original component.

Accept-Risk acceptance is a risk response strategy whereby the project team decides to acknowledge the risk and not take any action unless the risk occurs. This strategy is adopted where it is not possible or cost-effective to address a specific risk in any other way. This strategy indicates that the project team has decided not to change the project management plan to deal with a risk, or is unable to identify any other suitable response strategy. This strategy can be either passive or active. Passive acceptance requires no action except to document the strategy, leaving the project team to deal with the risks as they occur, and to periodically review the threat to ensure that it does not change significantly. The most common active acceptance strategy is to establish a contingency reserve, including amounts of time, money, or resources to handle the risks.

Strategies for Positive Risks or Opportunities

Three of the four responses are suggested to deal with risks with potentially positive impacts on project objectives. The fourth strategy, accept, can be used for negative risks or threats as well as positive risks or opportunities. These strategies, described below, are to exploit, share, enhance, and accept.

Exploit-The exploit strategy may be selected for risks with positive impacts where the organisation wishes to ensure that the opportunity is realised. This strategy seeks to eliminate the uncertainty associated with a particular upside risk by ensuring the opportunity definitely happens. Examples of directly exploiting responses include assigning an organisation’s most talented resources to the project to reduce the time to completion or using new technologies or technology upgrades to reduce cost and duration required to realise project objectives.

Enhance-The enhance strategy is used to increase the probability and/or the positive impacts of an opportunity. Identifying and maximising key drivers of these positive-impact risks may increase the probability of their occurrence. Examples of enhancing opportunities include adding more resources to an activity to finish early.

Share-Sharing a positive risk involves allocating some or all of the ownership of the opportunity to a third party who is best able to capture the opportunity for the benefit of the project. Examples of sharing actions include forming risk-sharing partnerships, teams, special-purpose companies, or joint ventures, which can be established with the express purpose of taking advantage of the opportunity so that all parties gain from their actions.

Accept-Accepting an opportunity is being willing to take advantage of the opportunity if it arises, but not actively pursuing it.

Contingent Response Strategies

Some responses are designed for use only if certain events occur. For some risks, it is appropriate for the project team to make a response plan that will only be executed under certain predefined conditions, if it is believed that there will be sufficient warning to implement the plan. Events that trigger the contingency response, such as missing intermediate milestones or gaining higher priority with a supplier, should be defined and tracked. Risk responses identified using this technique are often called contingency plans or fall-back plans and include identified triggering events that put the plans into effect.
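The idea of defining and tracking triggering events can be sketched as a small data structure in which each contingency plan carries its own trigger condition. The class name `ContingencyPlan`, the function `triggered_plans`, the status fields, and the threshold values are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ContingencyPlan:
    name: str
    # Returns True when the tracked project status says the plan should start.
    trigger: Callable[[Dict[str, float]], bool]


def triggered_plans(plans: List[ContingencyPlan], status: Dict[str, float]) -> List[str]:
    """Return the names of all plans whose trigger condition currently holds."""
    return [plan.name for plan in plans if plan.trigger(status)]


# Hypothetical plans keyed to the two kinds of event mentioned in the text.
plans = [
    ContingencyPlan("Add a second design team", lambda s: s["milestone_slip_days"] > 10),
    ContingencyPlan("Bring delivery forward", lambda s: s["supplier_priority"] >= 1),
]
status = {"milestone_slip_days": 14, "supplier_priority": 0}

if __name__ == "__main__":
    print(triggered_plans(plans, status))
```

With the assumed status above, only the milestone-based plan is triggered, since the supplier priority has not changed.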

Expert Judgement

Expert judgement is input from knowledgeable parties pertaining to the actions to be taken on a specific and defined risk. Expertise may be provided by any group or person with specialised education, knowledge, skill, experience, or training in establishing risk responses.

Plan Risk Responses: Outputs

Project Management Plan Updates

Elements of the project management plan that may be updated as a result of carrying out this process include, but are not limited to:

Schedule Management Plan-The schedule management plan is updated to reflect changes in process and practice driven by the risk responses. This may include changes in tolerance or behaviour related to resource loading and levelling, as well as updates to the schedule strategy.

Cost Management Plan-The cost management plan is updated to reflect changes in process and practice driven by the risk responses. This may include changes in tolerance or behaviour related to cost accounting, tracking, and reports, as well as updates to the budget strategy and how contingency reserves are consumed.

Quality Management Plan-The quality management plan is updated to reflect changes in process and practice driven by the risk responses. This may include changes in tolerance or behaviour related to requirements, quality assurance, or quality control, as well as updates to the requirements documentation.

Procurement Management Plan-The procurement management plan may be updated to reflect changes in strategy, such as alterations in the make-or-buy decision or contract type(s) driven by the risk responses.

Human Resource Management Plan-The staffing management plan, part of the human resource management plan, is updated to reflect changes in project organisational structure and resource applications driven by the risk responses. This may include changes in tolerance or behaviour related to staff allocation, as well as updates to the resource loading.

Scope Baseline-Because of new, modified or omitted work generated by the risk responses, the scope baseline may be updated to reflect those changes.

Schedule Baseline-Because of new work (or omitted work) generated by the risk responses, the schedule baseline may be updated to reflect those changes.

Cost Baseline-Because of new work (or omitted work) generated by the risk responses, the cost baseline may be updated to reflect those changes.

CONTROL RISKS

Control Risks is the process of implementing risk response plans, tracking identified risks, monitoring residual risks, identifying new risks, and evaluating risk process effectiveness throughout the project. The key benefit of this process is that it improves efficiency of the risk approach throughout the project life cycle to continuously optimise risk responses.

Planned risk responses that are included in the risk register are executed during the life cycle of the project, but the project work should be continuously monitored for new, changing, and outdated risks.

The Control Risks process applies techniques, such as variance and trend analysis, which require the use of performance information generated during project execution. Other purposes of the Control Risks process are to determine if:

Project assumptions are still valid.

Analysis shows an assessed risk has changed or can be retired.

Risk management policies and procedures are being followed, and

Contingency reserves for cost or schedule should be modified in alignment with the current risk assessment.

Control Risks can involve choosing alternative strategies, executing a contingency or fall-back plan, taking corrective action, and modifying the project management plan. The risk response owner reports periodically to the project manager on the effectiveness of the plan, any unanticipated effects, and any correction needed to handle the risk appropriately. Control Risks also includes updating the organisational process assets, including project lessons learned databases and risk management templates, for the benefit of future projects.

Control Risks: Inputs

Risk Register

The risk register has key inputs that include identified risks and risk owners, agreed-upon risk responses, control actions for assessing the effectiveness of response plans, specific implementation actions, symptoms and warning signs of risk, residual and secondary risks, a watch list of low-priority risks, and the time and cost contingency reserves. The watch list is within the risk register and provides a list of low-priority risks.

Work Performance Data

Work performance data related to various performance results possibly impacted by risks includes, but is not limited to:

Deliverable status.

Schedule progress, and

Cost incurred.

Work Performance Reports

Work performance reports take information from performance measurements and analyse it to provide project work performance information including variance analysis, earned value data, and forecasting data. These data points can be useful in controlling performance-related risks.

Control Risks: Tools and Techniques

Risk Reassessment

Control Risks often results in the identification of new risks, the reassessment of current risks, and the closing of risks that are outdated. Project risk reassessments should be regularly scheduled. The amount and detail of repetition that is appropriate depends on how the project progresses relative to its objectives.

Risk Audits

Risk audits examine and document the effectiveness of risk responses in dealing with identified risks and their root causes, as well as the effectiveness of the risk management process. The project manager is responsible for ensuring that risk audits are performed at an appropriate frequency, as defined in the project’s risk management plan. Risk audits may be included during routine project review meetings, or the team may choose to hold separate risk audit meetings. The format for the audit and its objectives should be clearly defined before the audit is conducted.

Variance and Trend Analysis

Many control processes employ variance analysis to compare the planned results to the actual results. For the purposes of controlling risks, trends in the project’s execution should be reviewed using performance information. Earned value analysis and other methods of project variance and trend analysis may be used for monitoring overall project performance. Outcomes from these analyses may forecast potential deviation of the project at completion from cost and schedule targets. Deviation from the baseline plan may indicate the potential impact of threats or opportunities.
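The handout names earned value analysis without giving formulas. The sketch below uses the standard earned value definitions (cost variance, schedule variance, the two performance indices, and one common estimate-at-completion forecast that assumes current cost performance continues). The function name `earned_value_metrics` and the sample figures are illustrative assumptions.

```python
from typing import Dict


def earned_value_metrics(pv: float, ev: float, ac: float, bac: float) -> Dict[str, float]:
    """Compute basic earned value variances, indices and a simple forecast.

    pv:  planned value of work scheduled to date
    ev:  earned value of work actually completed to date
    ac:  actual cost incurred to date
    bac: budget at completion
    """
    cpi = ev / ac  # cost performance index (below 1 means over budget)
    spi = ev / pv  # schedule performance index (below 1 means behind schedule)
    return {
        "cost_variance": ev - ac,
        "schedule_variance": ev - pv,
        "cpi": cpi,
        "spi": spi,
        # Forecast assuming the cost efficiency seen so far persists.
        "estimate_at_completion": bac / cpi,
    }


if __name__ == "__main__":
    # Hypothetical status: 100 planned, 75 earned, 100 spent, 300 total budget.
    print(earned_value_metrics(pv=100.0, ev=75.0, ac=100.0, bac=300.0))
```

In this hypothetical case both variances are negative and the forecast at completion exceeds the budget, which is the kind of deviation from the baseline that may indicate a threat is materialising.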

Technical Performance Measurement

Technical performance measurement compares technical accomplishments during project execution to the schedule of technical achievement. It requires the definition of objective, quantifiable measures of technical performance, which can be used to compare actual results against targets. Such technical performance measures may include weight, transaction times, number of delivered defects, storage capacity, etc. Deviation, such as demonstrating more or less functionality than planned at a milestone, can help to forecast the degree of success in achieving the project’s scope.

Meetings

Project risk management should be an agenda item at periodic status meetings. The amount of time required for that item will vary, depending upon the risks that have been identified, their priority, and difficulty of response. The more often risk management is practised, the easier it becomes. Frequent discussions about risk make it more likely that people will identify risks and opportunities.

Control Risks: Outputs

Work Performance Information

Work performance information, as a Control Risks output, provides a mechanism to communicate and support project decision making.

Change Requests

Implementing contingency plans or workarounds sometimes results in a change request. Change requests are prepared and submitted to the Perform Integrated Change Control process. Change requests can include recommended corrective and preventive actions as well.

Recommended Corrective Actions-These are activities that realign the performance of the project work with the project management plan. They include contingency plans and workarounds. The latter are responses that were not initially planned, but are required to deal with emerging risks that were previously unidentified or accepted passively.

Recommended Preventive Actions-These are activities that ensure that future performance of the project work is aligned with the project management plan.

Project Management Plan Updates

If the approved change requests have an effect on the risk management processes, the corresponding component documents of the project management plan are revised and reissued to reflect the approved changes. The elements of the project management plan that may be updated are the same as those in the Plan Risk Responses process.

ACT OF GOD

An event that directly and exclusively results from the occurrence of natural causes that could not have been prevented by the exercise of foresight or caution. An inevitable accident.

PHYSICAL DAMAGE

Industrial Accident

Construction collapse

Fire

Toxic release

Product or Service Failure

Product recall

Communications failure

Systems failure

Machine failure causes massive reduction in capacity

Faulty or dangerous goods

Health scare related to the product or industry

POLITICAL RISK

Political risk refers to the complications businesses may face as a result of what are commonly referred to as political decisions, or “any political change that alters the expected outcome and value of a given economic action by changing the probability of achieving business objectives”. Political risk faced by firms can be defined as “the risk of a strategic, financial, or personnel loss for a firm because of such nonmarket factors as macroeconomic and social policies (fiscal, monetary, trade, investment, industrial, income, labour, and developmental), or events related to political instability (terrorism, riots, coups, civil war, and insurrection).”

Understanding risk partly as probability and partly as impact provides insight into political risk. For a business, the implication for political risk is that there is a measure of likelihood that political events may complicate its pursuit of earnings through direct impacts (such as taxes or fees) or indirect impacts (such as opportunity cost forgone). As a result, political risk is similar to an expected value such that the likelihood of a political event occurring may reduce the desirability of that investment by reducing its anticipated returns.
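The expected-value view can be illustrated with a small calculation. This is a minimal sketch, not a formal valuation method: the function name and all figures are hypothetical, and it assumes a single political event with a known probability and a known loss.

```python
def risk_adjusted_return(expected_return, event_probability, loss_if_event):
    """Anticipated return after allowing for one possible political event.

    expected_return: return anticipated if the event does not occur
    event_probability: estimated likelihood of the event (0 to 1)
    loss_if_event: reduction in return if the event does occur
    """
    if not 0.0 <= event_probability <= 1.0:
        raise ValueError("event_probability must be between 0 and 1")
    # Expected loss = probability x impact
    expected_loss = event_probability * loss_if_event
    return expected_return - expected_loss


# Hypothetical investment: 500,000 anticipated return and a 25% chance of a
# new tax that would cost 200,000.
adjusted = risk_adjusted_return(500_000, 0.25, 200_000)
print(adjusted)  # 450000.0
```

The higher the likelihood or the impact of the event, the lower the adjusted return, which is what makes the investment less desirable.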

There are both macro- and micro-level political risks. While these are included in country risk analysis, it would be incorrect to equate macro-level political risk analysis with country risk as country risk only looks at national-level risks and also includes financial and economic risks. Micro-level risks focus on sector, firm, or project specific risk.

DESIGN

Maximum uptime is a philosophy. It begins with the planning of your facility and remains a continuous process through every step of design, construction, commissioning, operations, failure analysis and re-commissioning.

Design Failures

Design failures can be eliminated through proper planning and by engaging with competent vendors. Begin with the end in mind and come up with a design intent document that clearly spells out your requirements – in detail. Whether it’s new construction, upgrading or operating an existing mission-critical facility, it is important to carefully plan the work and to work the plan. It is also crucial to have a good design firm, integration firm, construction companies and commissioning team along with a well-trained operations staff to reduce failures.

Catastrophic Failures

A comprehensive maintenance and operations program can identify and eliminate many potential problems, helping you avoid catastrophic failures. Your program should include well-defined maintenance windows, with appropriate redundancy built in so services are not interrupted while maintenance is performed. Predictive maintenance is another important consideration, which entails conducting a thorough failure analysis after each incident and using the results to predict and prevent future problems. It is also important to have a comprehensive training program for the operations and maintenance staff, starting with training from equipment manufacturers or installers but continuing with regular training to keep operational and maintenance staff current.

Compounding Failures

At times multiple events occur to create a failure, a situation known as a compounding failure. Lack of attention to details is a leading cause of compounding failures. Consider what happens should your data centre suffer a power outage. Your generator should receive a start signal and fire up immediately. But if you’ve neglected to check the generator battery, fuel and coolant levels for months on end, it may let you down. Similarly, little nuisance items in a large facility are sometimes left unnoticed and by themselves cause no ill effect to the facility, but along with other problems, can combine to create a system failure.

Human-error Failures

Human error is a leading cause of failures in mission-critical facilities. Training can help reduce the incidence of human failure but another requirement is detailed methods of procedure (MOPs). MOPs define in detail how to perform various maintenance functions, ensuring they are consistently performed in the same way. Too often, in the rush to bring the facility online, organizations fail to develop, document and deploy MOPs. These procedures should be developed early and tested before the facility is fully operational. Waiting to develop a procedure to transfer the UPS system to maintenance bypass could prove much more costly than investing the time upfront to prepare for the inevitable. MOPs should also be executed with a pilot/co-pilot approach, to ensure the procedure is followed to a T.

SUPPLIERS

Supplier risk management is an evolving discipline in operations management for manufacturers, retailers, financial services companies and government agencies where the organization is highly dependent on suppliers to achieve business objectives. Outsourcing, globalization, lean supply chain initiatives and supplier rationalization have contributed to a highly fragmented model, where control is often several steps removed from the corporation. With an influx of Chief Risk Officers (CROs) coming from audit firms to fill newly-created CRO positions or update the talent of existing positions, the concept of risk narrative has grown, and continues to grow, in importance.

While these models have allowed companies to reduce overall costs and expand quickly into new markets, they also expose the company to the risk of a supplier suddenly going bankrupt, closing operations or being acquired.

Objectives

To overcome these challenges, companies mitigate supply chain interruptions and reduce risk with strategies and tactics that address supplier-centric risk at multiple stages in the relationship:

Onboarding: Bringing suppliers into the operation with registration that includes:

A centralized supplier registration portal

Integration of third party performance, financial data and predictive indicators into the supplier profile.

Monitoring for stability beyond financial data, including:

Criminal and terrorist ties and operational performance

Visibility into potential disruptions caused by geopolitical threats, acts of nature, etc.

Cultivating strategic supplier relationships for the long-term:

Leverage supplier scorecards for continuous improvement

Establish and use benchmarks for measuring supplier performance

Creating a system for collaboration and supplier development.

Establish control across the extended enterprise:

Create integrated supplier networks

Extend performance management benchmarks to second and third tier suppliers.

Supplier Risk in Recession and Recovery

In 2008-2009, manufacturers experienced the startling speed at which suppliers can move from stability to shutting down operations. The devastating impact of a crucial supplier failure has moved risk management from add-on service to mission-critical. With a new focus on risk management, manufacturers have seen value whether the economy is stagnant or thriving.

With a transparent, accessible and comprehensive set of supplier information, manufacturers have been able to monitor suppliers for behavioural changes that may affect overall stability, including:

Changes in the supplier’s management team

EPA violations

OSHA incidents

Quality issues

Noticeable lags in response time to inquiries

OFAC violations

Changes in any of these conditions can be defined as parameters for raising an alert. For example, a financially stable supplier may in fact be about to lose its CEO to retirement, which may cause issues within the management team. Early visibility into that change gives the manufacturer time to ensure it doesn’t negatively affect customers.

Based on the criticality of the supplier and the nature of the alert received, the manufacturer can then choose to take necessary action, such as calling or visiting the supplier, increasing monitoring, or moving towards terminating the relationship with the supplier and finding a replacement.
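The alerting idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the condition identifiers, the `Supplier` class and the rules for choosing an action are hypothetical, not taken from any particular supplier-management system.

```python
from dataclasses import dataclass

# Monitored conditions from the list above; the identifiers are hypothetical.
MONITORED_CONDITIONS = {
    "management_change",
    "epa_violation",
    "osha_incident",
    "quality_issue",
    "slow_response",
    "ofac_violation",
}


@dataclass
class Supplier:
    name: str
    critical: bool  # True if the business depends heavily on this supplier


def raise_alerts(observed_changes):
    """Return the observed changes that match a monitored condition."""
    return sorted(set(observed_changes) & MONITORED_CONDITIONS)


def recommend_action(supplier, alerts):
    """Choose a follow-up action; the rules here are illustrative only."""
    if not alerts:
        return "no action"
    if supplier.critical and len(alerts) >= 2:
        return "visit supplier and start looking for a replacement"
    if supplier.critical:
        return "call or visit supplier"
    return "increase monitoring"


supplier = Supplier(name="Example Castings Ltd", critical=True)
alerts = raise_alerts(["management_change", "price_increase"])
print(alerts, "->", recommend_action(supplier, alerts))
```

Here only the change in management matches a monitored condition, so a critical supplier triggers a call or visit rather than a search for a replacement.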

Benefits

Reducing supplier risk can:

Give insight to manufacturers to create defensive and offensive strategies that turn risk into a competitive advantage.

Help determine whether or not it is beneficial for a company to conduct a customer intervention and know in advance what the potential outcomes might be for an intervention.

Improve competitive position in the market.

Lower supplier costs.

Position manufacturers to better address customer needs by addressing supplier vulnerabilities before they become apparent.

PEOPLE

Human Factors: Managing Human Failures

Everyone can make errors no matter how well trained and motivated they are. However, in the workplace, the consequences of such human failure can be severe. Analysis of accidents and incidents shows that human failure contributes to almost all accidents and exposures to substances hazardous to health. Many major accidents, e.g. Texas City, Piper Alpha and Chernobyl, were initiated by human failure.

In order to avoid accidents and ill-health, companies need to manage human failure as robustly as the technical and engineering measures they use for that purpose.

The challenge is to develop error tolerant systems and to prevent errors from initiating; to manage human error proactively it should be addressed as part of the risk assessment process, where:

Significant potential human errors are identified.

Those factors that make errors more or less likely are identified (such as poor design, distraction, time pressure, workload, competence, morale, noise levels and communication systems) – Performance Influencing Factors (PIFs).

Control measures are devised and implemented, preferably by redesign of the task or equipment.

This Key Topic is also very relevant when trying to learn lessons following an incident or near miss. This also involves identifying the human errors that led to the accident and those factors that made such errors more likely – PIFs.

Types of Human Failure

It is important to be aware that human failure is not random; understanding why errors occur and the different factors which make them worse will help you develop more effective controls. There are two main types of human failure: errors and violations.

A human error is an action or decision which was not intended. A violation is a deliberate deviation from a rule or procedure. The following may be a helpful introduction.

Some errors are slips or lapses, often “actions that were not as planned” or unintended actions. They occur during a familiar task and include slips (e.g. pressing the wrong button or reading the wrong gauge) and lapses (e.g. forgetting to carry out a step in a procedure). These types of error occur commonly in highly trained procedures where the person carrying them out does not need to concentrate on what they are doing. These cannot be eliminated by training, but improved design can reduce their likelihood and provide a more error tolerant system.

Other errors are mistakes, or errors of judgement or decision-making, where the “intended actions are wrong”, i.e. where we do the wrong thing believing it to be right. These tend to occur in situations where the person does not know the correct way of carrying out a task, either because it is new and unexpected or because they have not been properly trained (or both). Often in such circumstances, people fall back on remembered rules from similar situations which may not be correct. Training based on good procedures is the key to avoiding mistakes.

Violations (non-compliances, circumventions, shortcuts and workarounds) differ from the above in that they are intentional but usually well-meaning failures where the person deliberately does not carry out the procedure correctly. They are rarely malicious (sabotage) and usually result from an intention to get the job done as efficiently as possible. They often occur where the equipment or task has been poorly designed and/or maintained. Mistakes resulting from poor training (i.e. people have not been properly trained in the safe working procedure) are often mistaken for violations. Understanding that violations are occurring and the reason for them is necessary if effective means for avoiding them are to be introduced. Peer pressure, unworkable rules and incomplete understanding can give rise to violations.

There are several ways to manage violations, including designing violations out, taking steps to increase their detection, ensuring that rules and procedures are relevant/practical and explaining the rationale behind certain rules. Involving the workforce in drawing up rules increases their acceptance. Getting to the root cause of any violation is the key to understanding and hence preventing the violation.

Understanding different types of human failure can help identify control measures but you need to be careful you do not oversimplify the situation. In some cases it can be difficult to place an error in a single category – it may result from a slip or a mistake, for example. There may be a combination of underlying causes requiring a combination of preventative measures. It may also be useful to think about whether the failure is an error of omission (forgetting or missing out a key step) or an error of commission (e.g. doing something out of sequence or using the wrong control), and taking action to prevent that type of error.
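The distinctions described above can be written as a simple decision procedure. This is a rough sketch only: the function and parameter names are hypothetical, and, as noted above, a real failure may not fit a single category or may have several underlying causes.

```python
def classify_failure(deliberate, action_as_planned, step_omitted=False):
    """Classify a human failure using the categories described above.

    deliberate: the person knowingly deviated from a rule or procedure
    action_as_planned: the action carried out was the one the person intended
    step_omitted: for unplanned actions, whether a required step was left out
    """
    if deliberate:
        # Intentional deviation, usually well-meaning
        return "violation"
    if not action_as_planned:
        # Unintended action during a familiar task
        return "lapse" if step_omitted else "slip"
    # Action went as planned, but the plan itself was wrong
    return "mistake"


# Examples drawn from the text above
print(classify_failure(False, False))                     # pressing the wrong button
print(classify_failure(False, False, step_omitted=True))  # forgetting a step
print(classify_failure(False, True))                      # doing the wrong thing believing it right
print(classify_failure(True, True))                       # taking a deliberate shortcut
```

The order of the checks matters: intent is tested first, because a deliberate deviation is a violation regardless of how the action itself turned out.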

The likelihood of these human failures is determined by the condition of a finite number of Performance Influencing Factors, such as design of interfaces, distraction, time pressure, workload, competence, morale, noise levels and communication systems.

Key Principles in Managing Human Failure

Human failure is normal and predictable. It can be identified and managed.

Industry should tackle error reduction in a structured and proactive way, with as much rigour as the technical aspects of safety. Managing human failure should be integral to the safety management system.

A poorly designed activity might be prone to a combination of errors and more than one solution may be necessary.

Involve workers in design of tasks and procedures.

Risk Assessment should identify where human failure can occur in safety critical tasks, the performance influencing factors which might make it more likely, and the control measures necessary to prevent it.

Incident Investigations should seek to identify why individuals have failed rather than stopping at ‘operator error’.

Common Pitfalls in Managing Human Failure

There is more to managing human failure in complex systems than simply considering the actions of individual operators. However, there is obvious merit in managing the performance of the personnel who play an important role in preventing and controlling risks, as long as the context in which this behaviour occurs is also considered.

When assessing the role of people in carrying out a task, be careful that you do not:

Treat operators as if they are superhuman, able to intervene heroically in emergencies.

Assume that an operator will always be present, detect a problem and immediately take appropriate action.

Assume that people will always follow procedures.

Rely on operators being well-trained, when it is not clear how the training provided relates to accident prevention or control.

Rely on training to effectively tackle slips/lapses.

State that operators are highly motivated and thus not prone to unintentional failures or deliberate violations.

Ignore the human component completely and fail to discuss human performance at all in risk assessments.

Inappropriately apply techniques, such as detailing every task on site and therefore losing sight of targeting resources where they will be most effective.

In quantitative risk assessment, provide precise probabilities of human failure (usually indicating very low chance of failure) without documenting assumptions/data sources.

Companies should consider whether any of the above apply to how their organisation manages human factors.

Make specific requests. Clarify your expectations and requests at every opportunity by rephrasing conversational remarks as directives. Instead of saying, "We should have the new Web page up by next week, yes?" try, "I expect you to finish the new Web page by Monday, with these items completed, and give it to me for final approval so we can go live Wednesday."

Record your expectations. While clear requests are a good start, you can still end up in an "I said, you said" dispute if the task was interpreted differently than you intended. Instead of searching through audio recordings, meeting notes or emails, use project management software such as Basecamp or Asana to assign and track tasks and deliverables. This system of assigning allows the employees to ask clarifying questions, give updates and request more information. It also gives you a point of reference if there is a problem later on.

Address problems with facts. Even the best systems and clearest explanations are prone to some level of error, whether intentional or due to neglect. By focusing on facts instead of feelings, you can keep some of the emotional charge out of the conversation. There should be no ambiguity around the action and results expected when you refer back to the initial request and show documentation. Furthermore, by seeing that the request was clear, the expectation was documented and instruction was given, you'll be able to more accurately distinguish the team members who are genuinely incompetent from those who simply misunderstood you.

Action Errors

Characteristics: Associated with familiar tasks that require little conscious attention. These 'skill-based' errors occur if attention is diverted, even momentarily. Resulting action is not intended: “not doing what you meant to do”. Common during maintenance and repair activities.

Failure type: Slip (Commission). A simple, frequently-performed physical action goes wrong. Examples:

flash headlights instead of operating windscreen wash/wipe function

move a switch up rather than down (wrong action on right object)

take reading from wrong instrument (right action on wrong object)

transpose digits during data input into a process control interface

Failure type: Lapse (Omission). Short-term memory lapse; omit to perform a required action. Examples:

forget to indicate at a road junction

medical implement left in patient after surgery

miss crucial step, or lose place, in a safety-critical procedure

drive road tanker off before delivery complete (hose still connected)

Typical control measures for action errors:

human-centred design (consistency e.g. up always means off; intuitive layout of controls and instrumentation; level of automation etc.)

checklists and reminders; procedures with 'place markers' (tick off each step)

independent cross-check of critical tasks (PTW)

removal of distractions and interruptions

sufficient time available to complete task

warnings and alarms to help detect errors

often made by experienced, highly-trained, well-motivated staff: additional training not valid

Thinking Errors

Characteristics: Decision-making failures; errors of judgement (involve mental processes linked to planning; info. gathering; communication etc.). Action is carried out, as planned, using conscious thought processes, but wrong course of action is taken: 'do the wrong thing believing it to be right'.

Failure type: Rule-Based Mistake. If behaviour is based on remembered rules and procedures, mistake occurs due to misapplication of a good rule or application of a bad rule. Examples:

misjudge overtaking manoeuvre in unfamiliar, under-powered car

assume £20 fuel will last a week but fail to account for rising prices

ignore alarm in real emergency, following history of spurious alarms

Failure type: Knowledge-Based Mistake. Individual has no rules or routines available to handle an unusual situation: resorts to first principles and experience to solve problem. Examples:

rely on out-of-date map to plan unfamiliar route

misdiagnose process upset and take inappropriate corrective action (due to lack of experience or insufficient / incorrect information etc.)

Typical control measures for thinking errors:

plan for all relevant 'what ifs' (procedures for upset, abnormal and emergency scenarios)

regular drills/exercises for upsets/emergencies

clear overview / mental model (clear displays; system feedback; effective shift handover etc.)

diagnostic tools and decision-making aids (flow-charts; schematics; job-aids etc.)

competence (knowledge and understanding of system; training in decision-making techniques)

organisational learning (capture and share experience of unusual events)

Non-compliance

Characteristics: Deliberate deviations from rules, procedures, regulations etc. Also known as 'violations'. Knowingly take short cuts, or fail to follow procedures, to save time or effort. Usually well-meaning, but misguided (often exacerbated by unwitting encouragement from management for 'getting the job done').

Failure type: Routine. Non-compliance becomes the 'norm'; general consensus that rules no longer apply; characterised by a lack of meaningful enforcement. Examples:

high proportion of motorists drive at 80mph on the motorway

PTWs routinely authorised without physical, on-plant checks

Failure type: Situational. Non-compliance dictated by situation-specific factors (time pressure; workload; unsuitable tools & equipment; weather); non-compliance may be the only solution to an impossible task. Example:

van driver has no option but to speed to complete day's deliveries

Failure type: Exceptional. Person attempts to solve problem in highly unusual circumstances (often if something has gone wrong); takes a calculated risk in breaking rules. Examples:

after a puncture, speed excessively to ensure not late for meeting

delay ESD during emergency to prevent loss of production

Typical control measures for non-compliance:

improve risk perception; promote understanding and raise awareness of 'whys' & consequences (e.g. warnings embedded within procedures)

increase likelihood of getting caught

effective supervision

eliminate reasons to cut corners (poor job design; inconvenient requirements; unnecessary rules; unrealistic workload and targets; unrealistic procedures; adverse environmental factors)

improve attitudes / organisational culture (active workforce involvement; encourage reporting of violations; make non-compliance 'socially' unacceptable).