Page 1: Rare is the infrastructure that uses risk management on a regular … · 2018-04-04 · 2013), the Threat and Hazard Identification and Risk Assessment (THIRA) sponsored by the Federal

Discussion Draft


The Case for a Business Process Engineering Approach to Managing Security and Resilience of Lifeline Infrastructures and Regional Communities

Jerry P. Brashear, PhD; Paula L. Scalingi, PhD; and Ryan M. Colker, JD

1 Summary

For more than a decade, seven Congresses, two presidential administrations and a plethora of agency documents have declared that risk management is essential to preparing and protecting our regional communities and the lifeline critical infrastructures on which they rely. Yet very little true risk management has been undertaken beyond compliance with federal mandates. Rare is the infrastructure that uses risk management on a regular basis in its routine planning and budgeting processes, where it can create the greatest value – and rarer still is the regional coalition or partnership that does. Virtually none systematically and thoroughly examines risks posed by dependencies and interdependencies among the infrastructures, while virtually all recognize that failures in one could cascade through others across a major metro area, even a multi-state region. A variety of reasons contribute to this, including limited understanding and acceptance of risk management among local and regional public agencies and many infrastructures, as well as the absence of a defensible risk management process that includes dependencies and interdependencies among infrastructures.

This white paper summarizes a project, sponsored by the U.S. Department of Homeland Security’s (DHS) Office of Infrastructure Protection (IP), to employ a business process engineering approach to the analysis-and-decision environment of infrastructures, local agencies and regional communities and to review available federally sponsored risk methods. The results of these examinations were synthesized into a set of design specifications for an integrated risk management process that encompasses all phases of the risk/resilience management framework defined in the 2013 National Infrastructure Protection Plan (NIPP 2013), the Threat and Hazard Identification and Risk Assessment (THIRA) sponsored by the Federal Emergency Management Agency (FEMA) and the risk/resilience analysis national standard developed by the American Water Works Association (AWWA). A comprehensive, repeatable, defensible risk management process design is described as a model process that integrates sound risk/resilience decision-making among individual infrastructures and local agencies, regional public-private partnerships and the responsible state and federal agencies. A novel, organic, “bottom-up” implementation process, also based on business process engineering concepts, is advanced to overcome the conventional one-size-fits-all approach that has been so poorly received and ineffectively applied by the intended beneficiaries.

2 The Challenge, Federal Response, and Current Project

Headlines draw attention to the increased severity and frequency of natural disasters and the continuing threats of terrorist activities both at home and abroad. Natural events have increased significantly, escalating the losses in human casualties and property damage. In the U.S., more than 1,100 fatalities and economic damages of more than $188 billion occurred between 2011 and 2013, not counting lost productivity or economic activity or the federal government’s $136 billion in response and recovery grants – just to get back to “normal” (Weiss and Weidman, 2015). Almost daily, terrorist attacks are reported – and these are only a fraction of all those attempted.

Significant portions of the human, material and economic losses from disasters occur because such events disrupt the delivery of vital services of lifeline critical infrastructures (CIs), including energy, water and wastewater, transportation, and communications. For the present project, we add emergency services to this list. Without these CIs, communities can neither recover nor long survive. Any one infrastructure is interdependent with others, so the direct loss of one is exacerbated as an initial failure cascades to other infrastructures in a “chain reaction” that can spread losses widely throughout a region and beyond.
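The “chain reaction” described above can be pictured as a traversal of a dependency graph. The following is a minimal sketch only: the dependency map `DEPENDS_ON` and the rule that any lost supplier disrupts its dependents are illustrative assumptions, not findings of the project, and a real analysis would weigh backup supplies, partial service and restoration times.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each key relies on the services listed as its values.
DEPENDS_ON = {
    "energy": [],
    "water": ["energy"],
    "communications": ["energy"],
    "transportation": ["energy", "communications"],
    "emergency_services": ["water", "communications", "transportation"],
}


def cascade(initial_failure, depends_on):
    """Return the infrastructures disrupted when `initial_failure` goes down.

    Assumes, for simplicity, that losing any one supplier disrupts the
    dependent infrastructure. Results are in breadth-first order.
    """
    # Invert the map: for each supplier, which infrastructures rely on it?
    dependents = {name: [] for name in depends_on}
    for name, suppliers in depends_on.items():
        for supplier in suppliers:
            dependents.setdefault(supplier, []).append(name)

    failed = [initial_failure]
    seen = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:  # the visited set also handles mutual dependencies
                seen.add(dependent)
                failed.append(dependent)
                queue.append(dependent)
    return failed


if __name__ == "__main__":
    print(cascade("energy", DEPENDS_ON))
    print(cascade("water", DEPENDS_ON))
```

Under these assumed dependencies, an energy outage reaches every other lifeline, while a water outage reaches only emergency services.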

Additionally, such infrastructures face long-term underinvestment in maintenance, rehabilitation and replacement even as population and demand for their services increase. This underinvestment has stretched existing infrastructures to meet higher demand by operating closer to their design maxima and keeping aging facilities in service well beyond their design lives, making them more vulnerable to whatever hazards may occur. Climate change may render the design and construction standards of past times obsolete, as greater and more frequent loads and new operational demands are placed on existing structures and systems.

2.1 Federal Response to the Challenge

The most effective disaster management strategies are to withstand untoward events without unacceptable loss or service outage and, where events cannot be withstood, to restore operations with minimal outage or disruption. This is the concept of resilience: to maintain service delivery regardless of events. While the true test of resilience is in the face of a potentially disastrous event, the ability to be resilient is facilitated through risk/resilience analysis, planning and performance assessment before the event. Such pre-event mitigation has been shown to yield benefits of at least four times its cost (Multihazard Mitigation Council, 2005). Regional and community resilience relies upon the resilience of interdependent lifeline infrastructures and the emergency response and recovery functions of local governments. Consequently, the resilience of our nation is largely based on decisions made by local governments, local and regional infrastructure managers and their oversight boards.

The federal government has an inherent interest in security and resilience and has issued a number of policies, plans, methods, tools and incentives to assist communities, lifeline and other critical infrastructures in security and resilience decision-making.1 While these efforts are fundamental policies that have pointed the way to the achievement of resilient infrastructures and communities, their influence on investment and decision-making has been limited. The NIPP 2013 calls for “Employ[ing] the THIRA process as a method to integrate human, physical, and cyber elements of critical infrastructure risk” (NIPP 2013a, p. 22). THIRA is now being used widely by all state and local governments in 28 high-risk Urban Area Security Initiative (UASI) metropolitan regions to qualify for certain grants-in-aid. THIRA results are not currently used to screen or evaluate grant applications or to set grant amounts.

The risk management framework defined by NIPP 2013 and its Supplemental Tool is designed to assist analysts and decision-makers in identifying and understanding the most important risks an infrastructure or regional community faces, evaluating options for improving critical infrastructure security and resilience (CISR), and assessing their performance over time. Several efforts have been undertaken to operationalize NIPP 2013, of which the present project is one.

2.3 The Project

Improving the information used in the key risk management decisions is essential to improving CISR and lies at the core of NIPP 2013. The Department of Homeland Security (DHS) Office of Infrastructure Protection (IP) engaged the National Institute of Building Sciences (Institute) to assist in operationalizing the NIPP 2013 risk/resilience management framework in the form of a conventional business process. This CISR Risk Management Process (CISR-RMP) is intended to provide a workable, scalable, repeatable, defensible, integrable and practical process that lifeline CIs, local governments (especially emergency management) and regional public-private partnerships or coalitions (P3s) can use collaboratively to rationalize the allocation of scarce and constrained resources for security and resilience. Such a process would be fully integrated with ongoing, significant business processes such as asset management, continuity planning, and capital development planning and budgeting to assure risk management becomes a standard, routine business practice and avoids duplicative data collection or evaluation processes.

1 Presidential Policy Directives 8 (National Preparedness) and 21 (Critical Infrastructure Security and Resilience), along with the plans and systems that implement them, and Executive Orders 13636 (Improving Critical Infrastructure Cybersecurity) and 13653 (Preparing for Impacts of Climate Change) all emphasize the central role of risk analysis and management by critical infrastructure systems, state and local governments and regional public-private partnerships (P3s) in advancing the national goal of critical infrastructure security and resilience (CISR). The Office of Infrastructure Protection Strategic Plan: 2012-2016: Collaborate (2012) and The National Infrastructure Protection Plan 2013: Partnering for Critical Infrastructure Security and Resilience (NIPP 2013; IP, 2013a), with its Supplemental Tool: Executing A Critical Infrastructure Risk Management Approach (IP, 2013b), lay out a risk analysis framework in some detail.

The project proceeded through five stages: (1) interviewing managers of lifelines and local agencies to describe the current status of risk analysis and management; (2) reviewing, in summary form, federally sponsored lifeline CI risk tools; (3) defining a set of detailed design specifications for the CISR-RMP; (4) designing a CISR-RMP that meets the specifications; and (5) laying out a roadmap to narrow remaining gaps, refine the process and implement it in a manner that will promote its integration and sustained, routine application.

3 Local CI and Regional Decision Context and Constraints

To understand the decision context of risk management at the local lifelines, emergency response agencies and regional P3s, it is necessary to examine the processes currently in place and the decision environment and constraints within which these parties operate. The project team conducted a number of semi-structured interviews with a non-random selection of actual decision-makers and analytical staffs in lifeline CIs, local agencies and regional public-private coalitions. These interviews provided a clear understanding of on-the-ground conditions through the eyes of potential users of a CISR-RMP.

3.1 Overall Results

The range of capabilities and expertise that directly focuses on physical and cyber risks associated with interdependent lifelines and regions is very broad. Some large and forward-thinking jurisdictions and utilities have adopted sophisticated risk management as standard operating procedure – often using unique, proprietary or narrowly threat-specific risk analysis methods that cannot readily be transferred or integrated. Outside of these, most lifelines and local jurisdictions have performed very little risk analysis that leads to significant decisions and no resilience analysis beyond continuity of operations/continuity of government planning. Most jurisdictions and lifeline operators have chosen simply to comply with federal and state requirements (often at a cursory level) or to treat risk management as a periodic exercise (e.g., a special event every five years). Several stated that requirements from an external authoritative source (e.g., higher government, industry standards, or a regulatory agency) can ease the allocation of time and limited funds to risk analysis because they remove the need to justify the effort.

Scope of application and interest – In general, those interviewed reported that very little routine, systematic risk analysis occurs. Most often, an event (usually with an accompanying outage) demonstrates a need for remedial investment. Many respondents do not currently use formal risk analysis at all. Of those who do, few are satisfied with current processes. A number of utility respondents reported conducting a process simply to comply with requirements from higher authorities, rather than actually basing decisions on the results. Virtually all of them, however, use sophisticated business processes such as asset management; strategic, capital, continuity and operational planning; operating and capital budgeting; operations models; and performance appraisal that could readily contribute to and use the results of risk analysis.

State and local officials and infrastructure operators increasingly recognize the need to better understand all-hazards impacts on interdependent CIs. These respondents expressed serious interest in using a simple, low- or no-cost, transparent and manageable process to prioritize and justify actions and investments in security and resilience. Ideally, such a process would be routinely carried out by their staffs, perhaps with a minimal level of training and the availability of technical assistance as needed. Increasingly, their focus is on pre-event prevention, protection and mitigation, as well as post-disaster collaborative response, recovery and restoration of critical assets and systems. Those organizations that are interested appreciate expert advisors for both process and substantive suggestions on risk assessment options, but cost and time remain serious constraints.

Interdependencies – Virtually all the respondents were keen to better understand their risks due to disruptions of the CIs and suppliers they rely on. All were acutely aware of their dependencies and interdependencies, especially with respect to power and water outages and fuel availability, and some have taken steps to reduce these vulnerabilities through internal solutions like back-up power and water and fuel storage.

However, in most areas, the relevant agencies, e.g., emergency management, public health, public works and the respective utilities (whether publicly or privately owned), are “siloed” from one another, with little or no interaction, so interdependencies are virtually never analyzed beyond superficial levels. CI managers are acutely aware of the issue, but lack the tools and data to analyze it in depth. Many expressed reluctance to share highly sensitive information outside their own organizations. Tools can be developed through research, development and deployment efforts with long-term support, along with a detailed protocol that defines the minimum effective set of data and establishes confidentiality safeguards and penalties for violations. The prospect of increasing their own ability to manage interdependencies may offset the concerns about exchanging the minimum data necessary, under well-understood and enforced safeguards.

Major disasters like earthquakes and hurricanes affect most CIs and local governments at the same time, yet few risk analyses consider the potential that most or all CIs in a region could be damaged by the same event. For example, in its first cross-agency risk analysis, one metropolitan area discovered that water/wastewater, electricity distribution and public works (roads) each had contingent debris-removal contracts with the same company – which had enough capacity to service any one of these at a time, but not enough to service them all at the same time. One lifeline respondent had asked several of his critical customers, including other CIs, hospitals and local agencies, how long they expected a major outage to last. He found their estimates were 10% to 20% of the duration he projected from historical data.
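The debris-removal finding suggests a simple cross-agency check that could be run once contract data are pooled. The sketch below is illustrative only: the contractor names, the contract list and the notion of capacity as "number of clients served at once" are all hypothetical assumptions, not data from the interviews.

```python
from collections import defaultdict

# Hypothetical contingent contracts as (lifeline, contractor) pairs.
CONTRACTS = [
    ("water/wastewater", "DebrisCo"),
    ("electricity distribution", "DebrisCo"),
    ("public works (roads)", "DebrisCo"),
    ("transit", "HaulRight"),
]

# Hypothetical capacity: how many clients each contractor can serve at once.
CAPACITY = {"DebrisCo": 1, "HaulRight": 1}


def overcommitted(contracts, capacity):
    """Return contractors that could not serve all of their clients at once.

    Maps each overcommitted contractor to the sorted list of lifelines
    relying on it. Contractors with unknown capacity are treated as
    having none, so they are always flagged.
    """
    clients = defaultdict(list)
    for lifeline, contractor in contracts:
        clients[contractor].append(lifeline)
    return {
        contractor: sorted(lifelines)
        for contractor, lifelines in clients.items()
        if len(lifelines) > capacity.get(contractor, 0)
    }


if __name__ == "__main__":
    for contractor, lifelines in overcommitted(CONTRACTS, CAPACITY).items():
        print(f"{contractor} is shared by: {', '.join(lifelines)}")
```

A check of this kind only works if agencies are willing to share contract information, which is the constraint the respondents raised.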

Resilience – CIs and local agencies generally see resilience as synonymous with reliability or continuity, or as an outcome of risk management, rather than as a goal in itself or something to be analyzed separately from risk. Most infrastructure respondents were sensitive to the essential role played by their services in the well-being of their communities – their “public trust.” Several public-sector owners spontaneously raised the issue of balancing risk reduction for their own systems with maintaining or restoring service rapidly to the customers, a concrete example of the dual NIPP objectives of security and resilience. Several indicated that it is crucial to address the economic impacts of service disruption on their communities as well as risk to the utility as part of the risk analysis, “especially when there’s not enough return on investment to make the business case using only impacts to the utility,” as one local utility official said.

Across the nation, numerous utilities and service providers are incorporating resilience into their own continuity planning and are beginning to join with other organizations and associations focusing on community and regional resilience.

Risk management integration with other business processes – Several CI owners suggested linking any new methods directly with ongoing local processes such as asset management and/or economic and community development, and integrating them to increase the likelihood that the methods would be sustained over time and potentially lead to savings in the costs of the analysis efforts and the resulting options. The water, electricity and highways subsectors have all taken up asset management mainly to address the risks of seriously aging assets, but also extending to include all hazards, including financial ones – in other words, full enterprise risk management. Many respondents mentioned the need to find a way to measure security (risk) and resilience (e.g., expected outage) in ways that can be reported to and understood by rate-setting boards, local governments, customers, the general public and state and national agencies, especially those that provide grants.

Comparability – Most CI operators had not thought about whether risk and resilience tools should be comparable across sectors, but those who had thought about it expressed the view that comparability would have many advantages, including in conducting interdependencies analyses and, especially, in better educating elected officials and their budget staffs, rate-setting bodies and the general public. Especially with larger investments in long-term security and resilience, selling risk reduction and resilience enhancements to these groups is necessary for the investments to be made.

Climate Change – Many respondents expressed significant concern about the locally pressing aspects of climate change. Along both coasts and the Gulf Coast, the concern is coastal storm surge associated with increasingly intense storms due to sea-level rise. Some are concerned about sea-level rise relative to low-lying topography and subsidence. In the Midwest and South, the issues are severe ice storms and snow in winter, leading to major flooding with spring snow melt, and tornadoes and derechos in summer. Much of the West is experiencing extreme drought and rampant wildfires. Virtually all of them are seeking solutions, but the idea of formal risk analysis and option valuation is seldom seen as part of that search.

Expectations of state/federal support – One reason for the limited use of risk analysis tools is the widely held belief among local agencies and publicly owned utilities that if disaster strikes, the federal or state governments will step in to pay for the majority of the costs of recovery and restoration, thus discounting the value of investments in prevention, protection or pre-event mitigation. One respondent went so far as to say, “Investing 100-cent dollars of local taxpayer or ratepayer money before a highly uncertain future event seems irrational compared to paying 25-cent dollars of local taxes [the typical local share, with 75% from the federal government] after the event has become a certainty, if and when it ever does.” Many federal employees associated with existing tools shared a similar view. Several respondents stressed the importance of continuity in risk analysis methods and results in educating elected and appointed budget- and rate-setting boards to the basic concepts of risk management. Clearly, the business case must be made for using risk analysis and investing in risk mitigation.
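The respondent’s “100-cent versus 25-cent dollars” reasoning can be made concrete with a small calculation. This is a sketch under stated assumptions, not an analysis from the project: it assumes the mitigation is paid entirely from local funds, that avoided losses are exactly four times the mitigation cost (the lower bound cited from the Multihazard Mitigation Council, 2005), and that all avoided losses would otherwise have been recovery costs with a 25% local share. The function name and parameters are hypothetical.

```python
def net_benefit(mitigation_cost, benefit_cost_ratio, local_share_of_recovery):
    """Compare the payoff of pre-event mitigation from two viewpoints.

    Returns (societal_net, local_net):
    - societal_net counts every avoided loss, whoever would have paid it;
    - local_net counts only the share of avoided loss the locality itself
      would have paid after the event, while the locality pays the full
      mitigation cost up front.
    """
    avoided_loss = mitigation_cost * benefit_cost_ratio
    societal_net = avoided_loss - mitigation_cost
    local_net = avoided_loss * local_share_of_recovery - mitigation_cost
    return societal_net, local_net


if __name__ == "__main__":
    # Illustrative figures only: $1 million of locally funded mitigation,
    # a 4:1 benefit-cost ratio, and a 25% local share of recovery costs.
    societal, local = net_benefit(1_000_000, 4.0, 0.25)
    print(f"Societal net benefit: ${societal:,.0f}")
    print(f"Local net benefit:    ${local:,.0f}")
```

Under these assumptions the investment returns $3 million net to society but only breaks even for the local budget, which is one way to see why the local business case is harder to make than the national one.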

Liabilities – A near-universal issue, especially in the private sector, is fear of legal liability and negligence suits associated with conducting risk analyses and then experiencing casualties or damages due to a known risk that was determined to be too low a priority to justify investment. Another issue is the cost associated with identifying a risk that requires substantial investment to mitigate but yields little or no incremental revenue or routine cost savings.

Beyond these generalizations, the respective lifelines exhibited subtle differences and distinctive features from one another:

3.2 Water/Wastewater

The water sector is a partial exception to the finding that little risk analysis is actually being performed by local CIs. The Bioterrorism Act of 2002 required all drinking water systems serving more than 3,300 people to conduct vulnerability or risk analyses and submit their results to the U.S. Environmental Protection Agency (EPA). In many utilities, this experience established an appreciation that risk analysis helped make the case for needed investments in security and reliability. The American Water Works Association (AWWA) adapted the water/wastewater method developed by the American Society of Mechanical Engineers (ASME) under DHS/IP sponsorship (ASME, 2007b) into an American National Standard, ANSI/AWWA J100-10: Risk and Resilience Management of Water and Wastewater Systems (AWWA, 2010). Released in 2010, J100-10 has sold several hundred copies, and DHS has designated the standard under the SAFETY Act, providing specific liability relief to its users. Many larger and mid-sized water and wastewater utilities have used or are currently using J100-10; others have developed their own risk tools incorporating elements of it. At least three software systems have been developed and are available to the public. Most of the major water system engineering firms offer a service based on the standard, often with their own proprietary software packages.

3.3 Transportation The transportation sector is also experiencing increased interest in risk management. The Moving Ahead for Progress in the 21st Century Act (MAP-21) (P.L. 112-141, signed July 6, 2012) set performance standards and requires a “risk-based asset management plan” that includes capital asset inventories with condition assessments, target improvements relative to performance measures, formal investment prioritization processes (based on risk reduction and life cycle costs) and progress reporting for highways (including bridges and tunnels) and transit systems. States are encouraged to include all infrastructure assets within highway rights-of-way. Sharing rights-of-way with water, telecommunications fiber-optic lines and energy distribution systems is quite common, minimizing eminent domain issues but augmenting interdependencies and proximity risks. The rule-making process to implement these requirements is ongoing. States will be required to conduct “risk management analysis” of assets relative to threats posed by “current and future environmental conditions, including extreme weather events, climate change, and seismic activity,” in the words of the rule-making summary, and to develop a ten-year financial plan, investment strategies to improve or preserve assets, and an ongoing system for measuring and managing the condition of roads and bridges. At least one state department of transportation (Colorado) has initiated a project to adapt the J100-10 method to this task.

3.4 Energy The North American Electric Reliability Corporation (NERC) is focused on raising and maintaining bulk power reliability, i.e., continuity of service at defined quality levels by the major transmission grids. The overall method is to establish mandatory standards and monitor compliance. NERC has developed a Critical Infrastructure Protection method, NERC CIP, a conditional risk approach (i.e., one that assumes threat likelihood of 1.0, or certainty) designed for compliance, now in its fifth edition. Nuclear power plants are subject to regular and continuing probabilistic risk analysis for a variety of hazards, mostly those that would cause a release of radioactive material or lead to a major meltdown. As part of the earlier ASME tool development, all U.S. nuclear plants completed a terrorism risk analysis. Other power plants and distribution systems typically have robust security programs covering both physical and cyber security. Many routinely exercise the detailed models used to plan and/or control their systems’ operations to identify ways of managing the loss of various assets. “N minus one” analyses, simulations of how the systems would adapt to sustain service if major assets were out of service one at a time, are routine in many power distribution systems. While such exercises directly address routine resilience, the project team did not find standardized all-hazards risk analysis among these organizations. The IEEE Power and Energy Society very recently recommended upgrading and integrating analysis for security and resilience with asset management for a holistic approach to all hazards, including wear and aging (Novosel et al., 2014).
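The idea behind an “N minus one” analysis can be sketched in a few lines. The example below is a deliberately simplified adequacy check, not the detailed operational models that utilities actually use: it ignores network topology and power flow and only asks whether the remaining capacity covers demand when each asset is removed in turn. The asset names, capacities and demand are hypothetical.

```python
def n_minus_one(capacities, demand):
    """Return, for each asset, whether demand is still met when that
    single asset is out of service (capacity-only simplification)."""
    total = sum(capacities.values())
    return {name: (total - cap) >= demand for name, cap in capacities.items()}


# Hypothetical system: capacities and demand in megawatts.
assets = {"substation A": 60, "substation B": 50, "substation C": 40}
results = n_minus_one(assets, demand=95)

for name, survives in results.items():
    status = "demand met" if survives else "SHORTFALL"
    print(f"loss of {name}: {status}")
```

In this assumed system only the loss of the largest asset leaves a shortfall, which is the kind of finding that would point to where redundancy is most valuable.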

3.5 Telecommunications Telecommunications providers are less formal in their approach to risk. They rely on their design engineers and maintenance personnel to identify potentially vulnerable situations involving their primary assets and perform limited, informal benefit/cost analysis to justify investments in risk reduction and resilience enhancement. They rely on “industry best practice standards,” internal company standards and historical experience with equipment failures to identify areas of concern. Telecommunications depend heavily on electricity to operate, so they make extensive use of batteries and emergency generators at their sites to assure reliable function during power outages. One telecommunications executive predicted any federal initiative to implement risk analysis requirements would be strongly resisted as “sounding like regulation,” but said that a sound, voluntary framework advanced through a partnership with state and local governments and other private entities would be more favorably received, especially if it provided sharing of information useful to their decision-makers, e.g., about interdependencies.

“Investing 100-cent dollars of taxpayer money before a highly uncertain future event seems irrational compared to paying 25-cent dollars of local taxes after the event has become a certainty, if and when it ever does.” – One respondent, expressing the view of several local and federal respondents

3.6 Emergency response In emergency management of all states and UASI regions, THIRAs are required for grant eligibility, but are not used by FEMA in setting grant amounts or specific allocations. THIRA is nominally comprehensive, covering all five preparedness mission areas – prevention, protection, mitigation, response and recovery – for all hazards. So far, however, it is being used for only 13 of 31 core capabilities, all in response and early recovery, according to FEMA guidance.

The local officials interviewed believed that THIRA is almost exclusively executed by state and local emergency managers, with the vast majority of grant funds very definitely expected to go to police, fire, rescue and emergency management, with little or none directed toward lifeline CIs or other functions. This makes emergency management almost as stove-piped as the CIs. Infrastructures were consulted mostly in areas having to do with first-responders’ capabilities, e.g., water for fire suppression, electricity for shelters and mass care facilities. THIRA has not yet been adapted for or adopted by lifeline infrastructures, despite the NIPP 2013 “call for action” to the contrary. At least one major metropolitan area has begun to train CI personnel in the use of THIRA.

One typical respondent called THIRA a “good concept, but a pain… a necessary evil,” and suggested it be made “less bureaucratic,” yet provide more concrete definitional and procedural guidance for those using it, especially in valuing and selecting among competing options. The current, broadly defined directions were seen more as cause for anxiety about whether users were applying it correctly than as the flexibility envisioned by its authors. Other users made similar comments, seeing THIRA as “very basic,” and almost always used as a means to comply with requirements for grant eligibility rather than in broader risk management. In the few places where THIRA is used for decision support, respondents said they use it to identify the most severe consequences (especially human casualties) and to rank response capability-building actions based on them. Vulnerabilities and, especially, threat likelihood play a much smaller role, if any, than consequences in resource allocation. The direct threat-capability linkage follows a traditional emergency management approach, so it feels natural to those using it.

Emergency managers suggested a number of improvements to THIRA, including development of a simple, but explicit common methodology to help delineate and estimate vulnerabilities and consequences and sort out options to justify selections and flexibility in choices (as opposed to “mandates”), coupled with more information about what capabilities and best practices others are using successfully. Several users expressed concern that ignoring threat likelihood encourages misallocations of attention and resources.

In addition to THIRA, respondents noted other extensive, federally sponsored programs and tools that address vulnerability- and risk-related issues, including vulnerability analyses or surveys conducted by Federal Protective Service Advisors (PSAs) and Transportation Security Administration (TSA) field personnel. Several emergency managers reported that, in the words of one, these “are a mixed bag.” Some offer substantial help and insight, but others less so, being intrusive, time-consuming and overly prescriptive as to countermeasures that communities should implement. None had specifically received risk analysis assistance from PSAs, and several were skeptical that the surveys offered were effective in understanding risk or deciding what to do about it. Several respondents shared the observation summarized by one: “DHS is about checking the boxes, not information sharing or problem-solving.”

In summary, lifeline CIs and emergency responders seldom conduct risk/resilience analysis to allocate resources to options to enhance CISR, but they are generally amenable to a competent process that provides substantial near-term value to them (e.g., grant eligibility, aid in selecting and defending options, and concrete insight into their vulnerabilities to interdependencies), that is low-cost or no-cost, and simple enough for their staffs to perform and explain to management and oversight agencies.

4 Federal Tool Screening Given the challenges and desired capabilities for a CISR-RMP, the project team undertook an effort to identify candidate tools for possible use or adaptation. The main criterion for effective tools is the ability to maximize risk reduction and resilience under constraints, including the ability to make key decisions without distortions introduced by the analytic method used. Any method that could materially distort this decision-making likely results in sub-optimal, inefficient and irrational choices and thus is indefensible.

The team only considered federally sponsored tools because they can be acquired and modified by the federal government, whereas privately developed tools entail additional costs, proprietary rights and other issues. The Institute’s project team met with federal agencies with responsibility for lifeline infrastructures and development of related tools and guidance. Altogether, the team identified and reviewed 21 tools.

4.1 Results of the Screening Of the 21 tools, only 11 warranted in-depth examination after a preliminary screening based on the criteria identified above. Ten tools were screened out for the following reasons:

Three estimate important elements in risk analysis, e.g., economic consequences or future weather, but do not actually estimate risk or benefits. Such tools can materially contribute to risk analyses, but only to complement a true risk method.²

Seven more tools were detailed surveys that produce index scores that benchmark an organization against others. Although these tools can identify areas of potential concern and suggest options for improving security and/or resilience, they do not measure risks, expected outages, consequences, or mitigation benefits – information necessary for cost-effective resource allocation decisions. These tools were not further assessed for the project.³

The remaining eleven federally sponsored risk tools for lifeline infrastructures, shown in Table 1, purport to estimate risk and support risk-mitigation decisions. All include the premise that risk is a function, usually the product, of threat likelihood, asset vulnerability and various consequences, i.e., Risk (R) = f(Threat Likelihood (T), Vulnerability (V), Consequences (C)), usually R = T × V × C, although T is often assumed away as 1.0. Other specifics vary widely. None fully captures uncertainty or correlations among the variables. The tools were evaluated relative to their ability to support key decisions without distortions. Key Decisions (Column A) are the minimum set of decisions required for rationally managing resources to achieve the greatest net benefit to both the infrastructure owners and the communities they serve. These decisions require specific Process Outputs (Column B), which, in turn, require systematic, repeatable, defensible estimation of the listed Constituent Terms (Column C).
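The product form of the risk premise, R = T × V × C, and the way it yields a mitigation benefit can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the procedure of any tool reviewed here: the `Scenario` class, the flood scenario and all of its numbers are hypothetical, T is treated as an annual likelihood, V as a conditional probability of damage, and C as a dollar loss.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    threat_likelihood: float  # T: assumed annual likelihood of the event
    vulnerability: float      # V: assumed probability of damage given the event
    consequence: float        # C: assumed loss in dollars given damage


def risk(s: Scenario) -> float:
    """Point-estimate risk as the product R = T * V * C (dollars per year)."""
    return s.threat_likelihood * s.vulnerability * s.consequence


def mitigation_benefit(before: Scenario, after: Scenario) -> float:
    """Benefit of an option, taken here as the reduction in risk it produces."""
    return risk(before) - risk(after)


# Hypothetical flood scenario at a pump station, before and after a floodwall
# that is assumed to lower vulnerability from 0.8 to 0.2.
baseline = Scenario("flood, no floodwall", 0.02, 0.8, 5_000_000)
mitigated = Scenario("flood, with floodwall", 0.02, 0.2, 5_000_000)

print(f"baseline risk:  {risk(baseline):,.0f} $/yr")
print(f"mitigated risk: {risk(mitigated):,.0f} $/yr")
print(f"annual benefit: {mitigation_benefit(baseline, mitigated):,.0f} $/yr")
```

Because all three terms are on ratio scales, the benefit can be compared directly with the option's cost, which is what the key decisions in the table require.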

Logical consistency of process (not necessarily identical processes) and directly comparable results are crucial for allocating resources across divisions of large, diverse corporations or governments, for analyzing interdependencies among CIs, and for aggregating to organizational totals at regional and higher levels for accountability and governance. In addition to the logical consistency of the analytical processes, comparability requires a standardization of the initial set of threat/hazard scenarios (Column D). To support aggregation, these scenarios must be mutually exclusive and collectively exhaustive, including an assumed “no other negative event” scenario (including all positive conditions, usually met by assumption). The “design objectives” row (in red) characterizes the desired process.

Ordinal scale tools necessarily have open-ended “greater than” categories for consequences and “less than” for threat likelihoods, both of which may vary over hundreds, thousands, millions, even billions of times, so ignoring them can seriously distort decision-making.

² CMIP Climate Data Processing Tool (DOT/FHWA), Hydraulic Engineering Circular Vol. 25 (DOT/FHWA), and Water Health and Economic Analysis Tool (WHEAT, USEPA).
³ Baseline Assessment for Security Enhancement (BASE) for Mass Transit (DHS/TSA), BASE for Highway Vehicles (DHS/TSA), Infrastructure Survey Tool (IST, DHS/IP), Modified IST (DHS/FPS), NIST Cyber Security Framework (DOC/NIST), NIST Infrastructure Community Resilience Framework (DOC/NIST), and Pipeline Corporate Security Review (DHS/TSA).

The five tools shown in the lower portion of the table estimate elements of the risk equation but rely on ordinal scales of measurement (e.g., low-medium-high-very high; green-yellow-red; 1-to-5 or 1-to-10 scales or even finer gradations). Ordinal scales do not have equal intervals (the distance between adjacent levels may be unequal) and necessarily have open-ended “greater than” categories for consequences and “less than” for threat likelihoods, both of which may vary over hundreds, thousands, millions, even billions of times. Ordinary arithmetic functions cannot be used with such scales – although many advocates have tried – so they can seriously distort resource allocation decision-making. These limitations make estimating risk levels and benefits of options mathematically impossible, so these tools cannot support rational benefit analysis or resource allocation. Use of such scales, however, provides evidence of risk-oriented thinking among their users. Such tools could evolve into effective risk methods if their ordinal scales were replaced with ratio scales.
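The distortion that arises from doing arithmetic on ordinal scores can be shown with a small sketch. The 1-to-5 bins, their thresholds and both scenarios below are hypothetical and do not come from any of the tools reviewed; vulnerability is omitted for brevity, so risk is approximated as likelihood times consequence. The point is only that the product of ordinal scores can rank two scenarios in the opposite order from their ratio-scale values.

```python
from bisect import bisect_right

# Assumed bin edges for a 1-to-5 scale; the top consequence bin and the
# bottom likelihood bin are open-ended, as the text describes.
LIKELIHOOD_BINS = [1e-4, 1e-3, 1e-2, 1e-1]   # annual likelihood
CONSEQUENCE_BINS = [1e5, 1e6, 1e7, 1e8]      # dollars


def ordinal_score(value, thresholds):
    """Map a ratio-scale value onto a 1-to-5 ordinal score."""
    return bisect_right(thresholds, value) + 1


def ordinal_risk(likelihood, consequence):
    """The common (but invalid) practice of multiplying ordinal scores."""
    return (ordinal_score(likelihood, LIKELIHOOD_BINS)
            * ordinal_score(consequence, CONSEQUENCE_BINS))


def ratio_risk(likelihood, consequence):
    """Expected loss on ratio scales (dollars per year)."""
    return likelihood * consequence


# Hypothetical scenarios: (annual likelihood, consequence in dollars).
frequent_minor = (0.5, 2e6)
rare_catastrophic = (0.0005, 5e10)

for label, s in [("frequent/minor", frequent_minor),
                 ("rare/catastrophic", rare_catastrophic)]:
    print(f"{label}: ordinal={ordinal_risk(*s)}, ratio={ratio_risk(*s):,.0f}")
```

With these assumed values the ordinal product favors the frequent, minor scenario, while the ratio-scale expected loss of the rare, catastrophic scenario is larger by more than an order of magnitude, because the open-ended top bin hides how large the consequence really is.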

The remaining six tools, shown at the top of the table, estimate the terms of the risk equation using ratio scales – equal distances between numbers and a true zero (absence of the quantity) – e.g., things that can be counted. However, five of the six use conditional risk for all included hazards (assuming the likelihood of unwanted events to be 1.0, or certainty). This practice unavoidably distorts key decisions, because the likelihood of a terrorist attack on a specific asset or subsystem in a given location is generally several orders of magnitude smaller than the likelihood of other hazards such as weather events. Further, conditional risks cannot be meaningfully compared or aggregated. Any of these six tools could readily be upgraded to demonstrate full ratio risk by incorporating the missing terrorism threat likelihood.
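A short sketch can show how conditional risk reorders priorities. The two hazards and every number below are hypothetical, chosen only so that the attack likelihood is several orders of magnitude smaller than the weather likelihood, as the text describes; the functions are an illustration, not the method of any reviewed tool.

```python
def conditional_risk(vulnerability, consequence):
    """Risk with threat likelihood assumed to be 1.0 (certainty)."""
    return vulnerability * consequence


def full_risk(threat_likelihood, vulnerability, consequence):
    """Risk including threat likelihood: R = T * V * C."""
    return threat_likelihood * vulnerability * consequence


# Hypothetical hazards: name -> (annual likelihood T, vulnerability V, loss C in $).
hazards = {
    "terrorist attack": (1e-6, 0.5, 50_000_000),
    "flood": (2e-2, 0.3, 10_000_000),
}

by_conditional = sorted(
    hazards, key=lambda h: conditional_risk(*hazards[h][1:]), reverse=True
)
by_full = sorted(hazards, key=lambda h: full_risk(*hazards[h]), reverse=True)

print("ranked by conditional risk:", by_conditional)
print("ranked by full risk:       ", by_full)
```

Under these assumptions the conditional ranking puts the attack first, while the full-risk ranking puts the flood first, so a budget allocated by conditional risk would be directed away from the larger expected loss.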

The exception to using conditional risk is provided by the standard ANSI/AWWA J100-10. The standard is currently being updated for release as ANSI/AWWA J100-15. While the 2010 version permitted use of ordinal scales (in the form of pre-set ranges, or “bins”) and conditional risk, the 2015 version drops such allowances due to the shortcomings just described. Both versions provide a “proxy” method for approximating terrorist threat based on the notion of the terrorist selecting a target and attack mode. It is referred to as the “proxy” method because it stands in lieu of a true likelihood estimate. The proxy method is a placeholder until an authoritative threat likelihood measure is available. The method adapts a study by the RAND Corporation and Risk Management Solutions Inc. (Willis, et al., 2007) and local conditions of actual terrorist attacks to estimate likelihood.

J100-10 has been applied to more than 100 water and wastewater systems, including some of the nation’s largest, such as Chicago; the National Capital Region (three systems); Richmond, Virginia; Long Beach, California; and Minneapolis. Metropolitan electricity and highway systems, emergency communications and dispatch, fire suppression, emergency medical service and police emergency operations have also used it successfully (Brashear, et al., 2011). The six ratio-scale tools use roughly comparable concepts and definitions of conditional risk, vulnerability, and consequences. Five of the six tools measure risk from the perspective of critical infrastructure owners, as opposed to the public; J100-10 does both; and THIRA considers community-level impacts, many of which are public. Three of the tools apply only to terrorist or malevolent threats, one deals only with natural hazards associated with climate change. The remaining two tools – THIRA and J100 (both editions) – use an all-hazards approach. The similarities between THIRA and J100 are sufficient to conclude that either could be converted to a common approach (perhaps with versions tailored to specific sectors) or made comparable enough to analyze regional risk and resilience of interdependent lifelines and other critical infrastructures and to support aggregation to organizational, jurisdictional, regional, state and national levels.

Table 1: Summary Review of Federally Sponsored Risk Methods & Tools for Lifeline Infrastructures

The last column of the table summarizes each tool’s maturity level based on the 1 – 5 scale “maturity model” used by the U.S. Department of Defense and other agencies, including elements of the Department of Homeland Security. The scale ranges from (1) ad hoc, beginning, undocumented; through (2) repeatable; (3) defined enough to be a standard business process; (4) managed through quantitative metrics; to (5) optimizing choices and self-improvement. None of the tools relying on conditional risk can reach level 5 because conditional risk cannot be used to calculate benefits. By defining and using a crude approximation of terrorist threat likelihood, J100 (both versions) can support constrained optimization, but lacks full, cross-infrastructure collaborative treatment of interdependencies, so it was assigned a 4.5.

4.2 Reasons for This Situation and Improvement Suggestions Because federal agencies sponsored all of these tools, they reflect federal concerns and focus on lifeline services predominantly provided by local public agencies, specifically water/wastewater, dams and highways. In sectors predominantly operated by the private sector, such as energy and telecommunications, the project team found no comparable, widely used tools. Most companies are well aware of the threats and hazards they face and use diverse self-generated or proprietary tools applied by in-house or consulting experts. Exploring possible comparability or sharing of tools and/or information with other lifelines will likely require a location-specific approach.

Some of the reasons for these findings were articulated in discussions with the federal personnel who presented these tools and made suggestions that are useful in designing a CISR-RMP. Many were very similar to comments made by CI, state and local government personnel. They include:

Differing perceptions of risk – Even among those tools that use risk concepts, there are at least three basic methodologies: (1) Emergency responders tend to focus on vulnerabilities that could cause fatalities and serious injuries and, perhaps secondarily, major property losses as things to be addressed through robust preparation for the “worst-of-the-worst” operational eventualities. Because life is precious, conditional risk makes sense to them. (2) Those that take the engineering/micro-economics/business perspective focus on allocating resources to maximize net benefits of reduced casualty liability and financial losses, benefit/cost ratios or return on investment, so full ratio risk from the perspective of the CI owner makes sense to them. (3) Those trained in public policy and welfare economics see the objective as constrained net benefit maximization (human and economic), but focus on the benefits and costs from the perspective of the public. Those holding these three perspectives seldom agree because they all believe theirs is uniquely correct. All three perspectives are reasonably legitimate in their respective domains: emergency response, enterprise management, and public policy and programming. A CISR-RMP should acknowledge this legitimacy and incorporate – but differentiate – all three where appropriate. AWWA J100-10 recognizes the distinction between the owners’ and the public’s perspectives by estimating both.

Unclear roles and responsibilities – While it is generally assumed that the CI owner has first responsibility for CISR investments, it is also clear that, in many cases, the CI owner must forgo potential public benefits because the owner and/or the regulator judges the level of benefits captured directly by the owner to be insufficient to justify the investment. Yet, neither the public nor local jurisdictions are aware of these decisions even though they may be profoundly impacted. Liability laws, corporate confidentiality and rate-justification requirements all act to limit the ability of CI owners to engage the public to collaborate financially in making these investments. Measuring risks and benefits to the public as well as to the owner, and making adjustments in institutional and legal issues, could address this impediment. For this reason, again, AWWA J100-10 estimates both.

Local expectations of federal remediation – Echoing state and local emergency managers, several federal employees candidly expressed the belief that if a major event should devastate a specific locality, the federal government will make it whole, or at least 75% whole, the usual federal cost share. Under this belief, local officials may regard risk/resilience management as optional or not important; tolerated if tied to grants; but not a significant decision-driver relative to competing claims on their scarce resources. In times of limited budgets and increasing numbers of increasingly serious events, the rapid and continuing escalation of federal outlays for disaster relief has caused numerous observers, including the Office of Management and Budget (OMB) and the U.S. Government Accountability Office (GAO), to note the need to reverse that trend for budgetary reasons. Local and regional risk/resilience analysis must be central to that effort, using a collaborative basis that allows both CIs and regional communities to make reasonable trade-offs and take financial responsibility for them.

Limited local expertise – Few local CIs or jurisdictions employ risk experts. Most rely on outside consultants, who are often free to use processes and tools of their own design, for better or worse, thereby adding to the difficulty in comparing results. Hubbard (2009, pp. 68-77) attributes the popularity of ordinal risk and index tools over ratio-scale tools to the proselytizing of management consultants. When voluntary federal tools are supported by active user training, technical assistance and quality assurance (TTA&QA), they have been more widely accepted and used. This demonstrates that when such expertise is available at little or no cost, it contributes to the acceptance and proper use of the tools.

Organizational silos – At the local and regional levels, the dependencies and interdependencies of CIs are among the most important threats to operational continuity and resilience. Risk tools advanced by federal agencies, each to its local counterpart, result in a variety of tools that cannot be used to compare risks or to support collaboration to manage interdependency risk. The development of common, consistent CISR processes (but not necessarily common tools) that all lifeline CIs and local jurisdictions (as well as other CIs and organizations in the community) can use, along with appropriate information-sharing protections, could allow reasonable collaboration and integration both in analysis and in equitably investing in solutions.

Lack of terrorism likelihood data – Local jurisdictions and operating units of lifeline infrastructures seldom have the ability to obtain information on the likelihood and nature of terrorist attacks, with the rare exceptions of imminent danger. Many of the lifelines’ CISR decisions pertain to massive, long-term capital investments in durable assets and systems that may last 30, 40, 50 or more years. These decisions will become more frequent and involve greater sums as adaptations to climate change begin to be addressed more widely. Conditional risk simply cannot support such decisions. Several federal agencies cited the lack of terrorism likelihood information as the principal driver of their encouragement of the use of conditional risk. Information sharing between the intelligence community and the CI community could address this. The U.S. Coast Guard is able to include terrorist threat likelihood in its Maritime Security Risk Analysis Model (MSRAM), an ordinal risk method, by securing the cooperation of the intelligence community, of which it is part. An office of DHS could be assigned the responsibility of intermediating with intelligence agencies to translate their qualitative information to pragmatic quantitative direction for state and local agencies’ use. Rough order-of-magnitude precision is all that is required, but it should be differentiated by location, target type, attack mode and any other revealed adversary preferences and capabilities. Users recognize that conditional risk seems to lead to “wrong” solutions, so they seldom consider it beyond its use for compliance.

Five of the six ratio-scale tools assume all threat likelihood to be 1.0, or certainty, as necessitated by the lack of malevolent threat likelihood data, so they unavoidably distort decisions because the likelihood of a terrorist attack on a specific asset is generally hundreds to billions of times smaller than the likelihood of other hazards in an all-hazards analysis, e.g., weather events. Without threat likelihoods, such results cannot meaningfully be used to value benefits, be compared or aggregated.

THIRA, for example, is a conditional ratio risk method, assuming adversary threat likelihood to be unity because of the absence of authoritative estimates. For “consistency,” natural hazard frequency is also assumed to be unity. In a world of limited resources, this practice could cause significant distortion in resource allocation. AWWA J100-10, by contrast, avoids this distortion by using its rough “proxy” method, a temporary “fix” awaiting a better, federally led alternative. NIPP 2013 calls for THIRA to be “employed” for CIs, but because it uses conditional risk, it cannot support major decisions relative to the rational standard, especially for long-duration capital decisions. THIRA should be refined in the direction of J100-15 – even if temporarily using some version of the J100 proxy – if it is to be effective for CIs and used in operationalizing NIPP 2013 (more about this later).

Lack of continuity and methodological maturation – DHS and other federal agencies have initiated development of several risk management processes that existed for a relatively short time, received limited testing, then were discontinued for reasons that are seldom explained. Examples include the RAMCAP series (which, in industry hands, has become one of the most advanced of the originally federally sponsored tools, AWWA J100-10); Voluntary Chemical Assessment Tool (VCAT) (which was decommissioned in favor of a survey/index approach); the DHS Science & Technology (S&T) sponsored feasibility pilot test of Regional Resilience/Security Analysis Process (dropped due to S&T budget cuts); and several others. The terminating events seem to be associated with changes in administrations, changes in senior personnel, frustration among users due to limited expertise, and lack of organizational processes by the sponsors to provide technical assistance or to facilitate local coordination and collaboration, among others. The result is that few full risk analysis methods have matured to the point of effectiveness and self-perpetuation. A process of iterative improvement, through an open-source process, would allow processes and tools to accumulate experience and mature and improve over time. The project team suggests shielding CISR-RMP development from premature termination contingencies by organizing the federal effort for development and implementation in a non-federal center, along with long-term, multi-agency funding and governance that reflects the federal sponsors, end users from lifeline and other CIs and local and state governments, and recognized risk experts (both academic and practitioners).

Cybersecurity may always be standards-driven – The mere facts of huge numbers of uncounted daily attacks (of all kinds and purposes) on jurisdictions’ and CIs’ cyber systems and the complexity and rapid evolution of these CI systems make full risk management under the R = T × V × C concept problematic for cybersecurity. This is especially true for “zero day” threats of exploiting vulnerabilities that have yet to be identified. If one cannot define the specific nature of the threat or the system’s vulnerability to it, it is difficult to see how conventional risk processes apply. The contemporary convention of best-practice, standards-based guidance (e.g., the National Institute of Standards and Technology [NIST] Cybersecurity Framework, 2014) may be the best risk management process currently available for cybersecurity. Risk management processes can be fruitfully applied, however, to the threats of control-system failures of various durations on the physical operations of systems; and risk mitigation options, such as manual controls or back-up automation, may be feasible and desirable.

4.3 Conclusions. None of the tools examined meets all the design objectives, but several are sufficiently similar that they could be adapted to a level of consistency adequate for comparison, interdependencies analysis, aggregation and rational decision support at various levels. THIRA and AWWA J100 and possibly others would be the most promising candidates because of their existing widespread base of use. The federal government will need to provide an authoritative means of estimating malevolent threat likelihood for any of the methods to be fully effective. Reasonable “proxy” approaches in the meantime facilitate better decisions than conditional risk approaches and allow repeatability. Interdependencies analysis requires cross-organization information sharing, enabled by public-private collaboration and a clear information sharing/protection protocol. Innovative implementation of fully defensible, repeatable methods may allow more complete integration with ongoing business processes, e.g., asset management, continuity planning and capital improvement planning. Finally, any new or synthesized approach should receive time and resource commitments to allow the process to mature through accumulating field experience and systematic reviews.

5 The CISR-RMP Design From this review of users’ decision needs and extant federally sponsored tools, the Institute team developed an integrated set of design specifications to guide the design of a CISR-RMP that can be used by diverse organizations and constituencies at the multiple interacting levels of organization necessary for addressing regional risk. The objective was not a tool, per se, but a model process that can be implemented in a variety of ways and still provide results that can be used in comparisons, interdependencies analysis, decisions and aggregations.

The CISR-RMP design balances two conflicting purposes: the “ideal” – to make the process fully effective in allocating resources for the greatest benefit, using state-of-the-art risk management – and the “pragmatic” – to make the process simple enough to be applied, understood, integrated into existing management processes and used routinely by the staffs and management of CIs, local governments and regional coalitions. The project team believes it has achieved this balance by adopting common DHS definitions of risk and resilience; relying on a threat-asset scenario approach, with point estimates of key risk terms (for the present); defining cross-CI interdependency analysis; and using constrained net-benefit maximization as the resource-allocation decision criterion. Certain desirable features of a state-of-the-art risk management process (e.g., full capture of uncertainty and correlations with Monte Carlo simulations, real options, portfolio optimization, etc.), while inherent in any contemporary ideal design, were seen as too complex for the present and so are deferred until user sophistication calls for them.

5.1 The NIPP’s Five Phases of CISR-RMP The CISR-RMP operationalizes the NIPP 2013 risk framework by enabling the five key sets of actions based on collaborative decision-making involving all three stakeholder constituency levels identified later:

1. Set Goals and Objectives – devolve the national goals and priorities into local goals and objectives. For this project’s purposes, this step also includes the specification of the threats and hazards of greatest concern to these decision-makers, i.e., what “keeps them up at night?”

2. Identify Infrastructure – define criticality for local analysts and decision-makers to use to focus on the most important systems, subsystems and assets relative to their respective organizational missions. The combinations of threats/hazards and critical assets, subsystems or systems (hereafter called threat-asset pairs) define the set of scenarios for the analysis.



3. Assess and Analyze Risks – estimate for each threat-asset pair the threat likelihood, vulnerability and consequences (including possible outages), both current and as anticipated in the future, and combine them into a “no change” baseline risk for each threat-asset pair, often called the “cost of inaction.” Long-standing DHS practice defines risk as a function of threat likelihood (T), vulnerability (V) and consequences (C), as defined above. Because the CISR-RMP estimates these elements as point values, the product function is used: R = T × V × C. For simplicity, the three main terms, T, V and C (and its subset, service outage, O), are treated as independent.4 Fragility (F), the expected value of service outage (O) and a metric of resilience,5 is also estimated. Risk may be to either the CI or the regional public, depending on whether consequences are defined as those to the CI or to the public.6

4 Threat likelihood, vulnerability and consequences are seldom independent when considering an adaptable adversary. Assuming that they are is a commonly used simplification that has generated substantial professional controversy. The principal dependency of concern is that a thinking adversary will adapt to the defender’s risk mitigations and the defender will respond with different mitigations, followed by the adversary’s additional adaptation, and so on. The academic solution to this issue is to use game theory, which is designed to solve specifically this class of problem. The technique, however, is far too complex and difficult for the decision environment addressed in this report. The Common Risk Method developed by the U.S. Army Corps of Engineers and the AWWA J100-10 proxy method both address this by noting that Threat Likelihood is, in fact, a choice made by the adversary – the likelihood the adversary will choose any particular asset target and attack mode, i.e., the scenario of interest. The adversary’s pay-off is approximated by previously estimated Consequences and Likelihood of Success – the definition of Vulnerability when it pertains to malevolent threats. By making Threat Likelihood at least partially based on Vulnerability and Consequences, the central dependency is captured, at least crudely, in a more intuitively understandable way. When risk mitigation decreases Vulnerability and/or Consequences, the threat-asset combination becomes less attractive, hence, less likely. Conditional risk methods ignore these issues.

5 Fragility, the expected value of service outage, is a measure of resilience of an operating system at the threat-asset level. Fragility is to resilience as risk, the expected value of loss, is to security. Security and resilience are enhanced as risk and fragility are reduced. Perfect resilience would be zero fragility – no service interruption at all. Fragility, like risk, is an expected value. It is defined as the service outage (daily outage amount times the number of days) weighted by the same threat likelihood and vulnerability as the associated threat-asset risk. Fragility reduction is, then, a subset of risk reduction mathematically, but, as an objective in its own right, it often suggests different options from risk reduction, e.g., back-up service-sharing plans with nearby systems may not much reduce the risk of loss, but they could significantly increase resilience. Conversely, reliance on risk transfer through insurance could significantly reduce the risk of loss, but does little to sustain the provision of vital lifeline services.

6 Consequences to the CI or other enterprise are those that directly impact the organization on a cash-forward basis, including repair and replacement, lost revenue, liabilities for casualties on- and off-site, contractual or governmental fees or fines for environmental damages, etc. All losses are taken after any insurance pay-outs. For for-profit CIs and other entities, these are estimated after taxes and lost depreciation is taken into account. Consequences to the public include lost gross regional product, the statistical value of life for any human casualties and the sum of all CI losses (except gross revenue) after full interdependencies are taken into account.

4. Implement Risk Management Activities – develop options to reduce unacceptable risk and enhance resilience by reducing threat likelihood, vulnerability or consequences (including outages); evaluate their life-cycle benefits relative to their life-cycle costs; choose the options with the greatest net benefits within budget and other constraints; and implement and manage the selected options.

5. Measure Effectiveness – estimate the extent to which the selected options were implemented according to plan and, much more importantly, whether in fact they have reduced risk and/or enhanced resilience and by how much.
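The point-estimate arithmetic of Phase 3 can be illustrated with a minimal sketch. It assumes that T and V are expressed as likelihoods between 0 and 1, that C is a dollar loss, and that risk and fragility can simply be summed across threat-asset pairs; the class and function names and all numbers are hypothetical and are not drawn from NIPP 2013, THIRA or AWWA J100.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatAssetPair:
    """Point estimates for one threat-asset scenario (illustrative values only)."""
    name: str
    threat_likelihood: float  # T: likelihood the threat occurs (0..1), assumed scale
    vulnerability: float      # V: likelihood of loss given the threat (0..1)
    consequences: float       # C: loss in dollars if the scenario occurs
    daily_outage: float       # service units lost per day of outage
    outage_days: float        # outage duration in days

    def risk(self) -> float:
        """Expected loss under the product function: R = T x V x C."""
        return self.threat_likelihood * self.vulnerability * self.consequences

    def fragility(self) -> float:
        """Expected service outage: F = T x V x O, with O = daily outage x days."""
        outage = self.daily_outage * self.outage_days
        return self.threat_likelihood * self.vulnerability * outage


def baseline(pairs):
    """Return (total risk, total fragility) across pairs.

    Simple summation is an assumption of this sketch; it relies on the same
    independence simplification discussed in the text.
    """
    return sum(p.risk() for p in pairs), sum(p.fragility() for p in pairs)


if __name__ == "__main__":
    flood = ThreatAssetPair("flood / pump station", 0.5, 0.25, 1_000_000.0, 8.0, 4.0)
    print(flood.risk(), flood.fragility())
```

Because fragility reuses the same T and V as the associated risk, an option that shortens the outage lowers fragility even when it leaves the dollar consequences, and hence risk, largely unchanged.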

5.2 Overall Logic and Levels of the CISR Risk Management Process The design began with the specification of an overall logic of risk management that could then be detailed for the user context and levels of organization. The highest-level logic is that defined by the NIPP 2013 risk framework. This was then reconciled with the logic of THIRA, and both of these with the logic of AWWA J100-10. These two tools were chosen as models because of their widespread application by lifelines and emergency managers and because they come closest to meeting the ideal defined in the summary tool comparison. Figure 1 displays the high degree of comparability among the three risk management systems. The NIPP framework lays out the process at its broadest, largely qualitative expression, while THIRA adds specificity to issues to be considered in each phase and suggests more quantification. AWWA J100 adds substantive, complementary detail of method. For the two more detailed processes to fulfill the NIPP logic, an additional major phase was required, “Measure Effectiveness.” From a systems logic perspective, this phase is essential to differentiating what is working from what is not – an absolute requirement of any rational resource allocation process. It is notable that the highest-level logic identified what all the examined risk management processes had missed.

Other risk management methods, e.g., proprietary and single-user unique tools, can be evaluated to determine whether their results can be compared and aggregated with those following this general logic.

All three of these approaches primarily consider the decision-making of a single entity, a CI or local agency, but a CISR-RMP business process must operate across entities and across levels of organization to deal with interdependencies, public interests and aggregation to policy and planning levels. For the present purposes, three levels are of greatest interest:

Figure 1: Alignment of NIPP 2013 Risk Management Framework, THIRA and AWWA-J100-10 and -15


1. Individual CIs and emergency response enterprises, public and private, whose analyses and decisions necessarily take an internal, stewardship orientation, establish the current actual levels of CISR in each region;

2. Regional public-private coalitions or partnerships that facilitate cross-CI cooperation, facilitate information sharing and analysis of dependencies and interdependencies, and conduct regional analyses from the regional public’s orientation; and

3. State and/or federal government agencies that set policy and guidance; develop tools and techniques; provide direct support, including training, technical assistance and quality assurance (TTA&QA); and aggregate risk, resilience, benefits and costs to state, multi-state regional and national totals for accountability and support to CISR policy and program decision-making at these levels.

The overall design task, then, is to define and describe a process that carries out the combined NIPP/THIRA/J100 five-phase logic across the three stakeholder levels with special attention to their direct interactions, while assuring the process is defensible, repeatable and feasible in all its aspects. To do that, it was necessary to define detailed design specifications.

5.3 Detailed Design Specifications Appendix A summarizes the specifications drawn from a national policy review, the review of the conditions of use for a CISR-RMP, and the technical screening underlying the federal tool comparison. It contains the most important design specifications, while more detailed, technical ones are shown in Attachment 1 and Appendix E of Brashear, et al., 2015.

Based on these specifications, the project team designed a CISR-RMP in enough detail to specify the necessary components, their characteristics and functions, their logical sequence and their interactions. The preliminary process integrates specific analysis-and-decision workflows divided into the five phases of the NIPP/THIRA/AWWA J100 logic. The phases are carried out through close interaction and information sharing among local lifeline enterprises (public and private); regional consortia or public-private partnerships; and state and/or federal CISR programs. Each phase is defined by the operational, analytic and decision tasks, and other information necessary to manage CISR rationally and effectively.

5.4 Design Summary of CISR-RMP for the Single Enterprise Based on the design objectives and the tools that best meet the technical criteria, the project team developed an enterprise-level CISR-RMP, as summarized in Figure 2.7 The process parallels the five phases of the NIPP 2013 Risk Management Framework, through which the enterprise fulfills the objectives of the NIPP Framework while managing its own security and resilience in its unique situation. The work performed in each of its five phases includes:

E.1 Define the enterprise’s goals and objectives based on its mission and functions; prioritize them by systematically assigning relative importance weights; review the existing business processes to examine in-place risk processes and other processes that could contribute to the risk management process as defined by this model CISR-RMP; plan the analysis and train the analysts; and select the threats and hazards of greatest concern from a standard threat/hazard set.

E.2 Identify and screen the systems, subsystems and assets that are crucial to the mission and functions, and compose threat-asset pair scenarios.

E.3 Calculate current and projected enterprise baseline risk and fragility (i.e., no new risk mitigation) for each threat-asset pair and aggregate them in a form useful for decision-making in the next phase. AWWA J100 provides detailed guidance on calculating baseline risk. Consistent with the enterprise’s stewardship role, the consequences of interest are those that directly affect the enterprise.

E.4 Sort the threat-asset pairs into those the enterprise will accept without treatment, those it will transfer through insurance and those it will act upon. Develop mitigation/resilience options to address this last group, estimate the amount by which the options will reduce one or more of the elements in the risk and fragility equations, re-estimate risk and fragility, and then value the options from the enterprise perspective based on their net benefits8 and life-cycle costs. Select and implement those options that best meet the CISR goal (i.e., greatest net benefits) and other enterprise goals up to the budget constraint. Achieving the gross benefits of the selected options constitutes their enterprise outcome objectives and, in aggregate, the enterprise CISR objectives.

7 For this report an “enterprise” is an individual CI, government agency, or other organized entity, public or private, that voluntarily chooses to use the CISR-RMP.

8 Net benefits are the gross benefit of an option (the difference between the threat-asset pair’s risk without the option and with it, over the option’s useful life) less the option’s life-cycle costs. Where benefits and/or costs extend beyond the present year, both are estimated over time and discounted to present value.

Figure 2: NIPP 2013 CI Risk Management Framework & Summary of Single Enterprise CISR Risk Management Process


E.5 Evaluate the performance of the chosen options relative to their implementation and operations plans and the progress they have made by re-estimating current actual enterprise outcomes of reduced risk and fragility based on the results of any real events (local or remote but similar) and local exercises; compare the actual performance to the enterprise’s baseline and objectives; and make mid-course corrections.
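The option valuation and selection in step E.4, using the net-benefit definition in note 8, can be sketched as follows. This is a simplified illustration under stated assumptions, not the project team's method: annual risk and annual costs are held constant over the option's useful life, discounting is end-of-year, options are treated as independent so their net benefits add, and the budget constrains up-front cost only. All function names and figures are hypothetical.

```python
from itertools import combinations


def present_value(annual_amount: float, years: int, rate: float) -> float:
    """Discount a constant end-of-year amount over `years` at `rate`."""
    return sum(annual_amount / (1.0 + rate) ** t for t in range(1, years + 1))


def net_benefit(risk_before: float, risk_after: float, years: int, rate: float,
                upfront_cost: float, annual_cost: float) -> float:
    """Gross benefit (discounted risk reduction) less discounted life-cycle cost."""
    gross = present_value(risk_before - risk_after, years, rate)
    life_cycle_cost = upfront_cost + present_value(annual_cost, years, rate)
    return gross - life_cycle_cost


def select_options(options: dict, budget: float):
    """Pick the subset with the greatest total net benefit within the budget.

    `options` maps an option name to (net_benefit, upfront_cost). Exhaustive
    search is used for clarity; it is practical only for small option sets.
    """
    best, best_value = (), 0.0
    names = sorted(options)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            cost = sum(options[n][1] for n in subset)
            value = sum(options[n][0] for n in subset)
            if cost <= budget and value > best_value:
                best, best_value = subset, value
    return best, best_value


if __name__ == "__main__":
    candidates = {
        "backup generator": (500.0, 300.0),
        "flood wall": (900.0, 700.0),
        "manual controls": (450.0, 200.0),
    }
    print(select_options(candidates, budget=500.0))
```

With the illustrative figures, the flood wall has the largest individual net benefit but does not fit a budget of 500, so the two smaller options are selected together. This is the sense in which the criterion is constrained net-benefit maximization rather than a ranking of individual options.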

When it is possible to make use of the enterprise’s existing business processes, models and tools in planning and conducting CISR risk management, it eases the integration of the CISR-RMP into the on-going, routine business processes of the enterprise. Risk management ceases to be a special “one-off” event and becomes normal and routine. The process then repeats and improves based on feedback, changing conditions and consideration of additional assets and hazards. The results of the process may be aggregated for local and higher-level decision-making in Phase 3 – baseline risk and fragility; Phase 4 – option valuation and the risk and fragility reduction objectives of options; and Phase 5 – actual, overall CISR progress from the original baseline and the degree to which actual outcomes met mitigation objectives.
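The Phase 5 comparison of actual outcomes with the baseline and the mitigation objectives reduces to simple arithmetic. The sketch below is illustrative only; the function name, the dictionary keys and the idea of reporting a single "share of objective met" are assumptions of this example rather than metrics specified in the source.

```python
def progress_report(baseline: float, objective: float, actual: float) -> dict:
    """Compare re-estimated risk (or fragility) with baseline and objective.

    baseline  - Phase 3 "no change" estimate
    objective - level the selected options were expected to reach (Phase 4)
    actual    - level re-estimated after implementation (Phase 5)
    """
    planned_reduction = baseline - objective
    actual_reduction = baseline - actual
    # Avoid dividing by zero when no reduction was planned.
    share = actual_reduction / planned_reduction if planned_reduction else None
    return {
        "planned_reduction": planned_reduction,
        "actual_reduction": actual_reduction,
        "share_of_objective_met": share,
    }


if __name__ == "__main__":
    print(progress_report(baseline=200.0, objective=120.0, actual=140.0))
```

The same calculation applies to fragility, and, provided the enterprises use consistent ratio-scale metrics, to totals aggregated across enterprises.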

Processes much like this are essential to success in industries where risk is a central part of their business models, e.g., pharmaceuticals; natural resource exploration and development; nuclear power generation; and, of course, insurance and re-insurance; among others. All of them use probabilistic risk analysis conducted using ratio-scale metrics and standard threat events and other uncertainties. Many quantify not only the point estimates, but also the uncertainty in these estimates, a future step for the CISR-RMP. Even without this improvement, use of this general approach with these measurement scales is essential to meeting the diverse requirements of a CISR-RMP for interdependent enterprises and regional collaboration.

5.5 Design Summary of CISR-RMP Process for the Regional Coalition The enterprise-level process is linked to a regional process, as shown in Figure 3. The regional process iterates between each of the enterprises and a voluntary regional coalition or public-private partnership through information sharing and collaboration. A formal information-sharing and protection agreement governs communications that flow between the two levels and among enterprises. Such an agreement and the interactive process are necessary to allow: (1) CI interdependencies analyses, (2) funding or cost-sharing of options with exceptional public benefits that the enterprises individually cannot justify, (3) the evaluation of actual outcomes of reduced risk and fragility, including interdependencies, and (4) aggregation of results at phases 3, 4 and 5 for reporting and accountability.

The work flow of the regional coalition (“the region”) parallels that of the enterprises and interacts with them as follows:

R.1 Form or adapt a voluntary regional coalition through a series of meetings, workshops and tabletop exercises for CI and local government managers to increase understanding that failures of lifeline infrastructures are major threats to everyone that cannot be addressed by enterprises working alone; negotiate and adopt the information sharing and protection agreement; define and weight regional public goals and objectives; and select the threats and hazards from the standard set and reconcile them with those of the participating enterprises.

R.2 Identify regionally critical infrastructure systems and define threat-system scenarios as the basis for working with the enterprises to assure all regionally important threat-system scenarios are reflected in the enterprises’ threat-asset pairs.

R.3 Analyze dependencies, interdependencies and regional economic impacts using the results of the enterprise baseline risk and fragility analyses, then estimate an overall regional baseline risk and fragility from the perspective of the regional public; and aggregate it for use by regional decision-makers and, in a summarized form, the general public.

R.4 Re-analyze the dependencies, interdependencies and economic impacts, assuming both enterprise-funded and unfunded options, by valuing all options from the public perspective. Some options with very large public benefits may be unfunded by enterprises because of insufficient direct enterprise benefits or because they fall below the enterprise’s budget constraint.9 These represent foregone public benefits that could be obtained by inducing the enterprises to accept top-ranked unfunded options. Inducements could be financial incentives, funded locally or from outside the community; regulations, building codes and land use zoning; privatizing parts of CIs; federal and state grants-in-aid; etc. In aggregate, the reductions in risk and fragility associated with the full set of funded options are the regional CISR objectives.

R.5 Evaluate the actual regional outcomes performance of all the implemented options, based on enterprise information and independent validation of the amount that aggregate risk and fragility have been reduced, to gauge the extent to which the region has made progress from the regional baseline and met its regional objectives. Aggregate regional performance for use at higher levels of government and with the general public.

Figure 3: NIPP 2013 CI Risk Framework & Summary of Enterprise and Regional CISR Risk Management Process

The participating enterprises use risk management processes that are logically and methodologically equivalent – i.e., all are versions of the model CISR-RMP, so their results are consistent and comparable – as customized for their existing internal processes, technologies, cultures and settings. By doing so, they voluntarily participate in the regional process because each stands to gain, potentially significantly, from the collective analysis of interdependencies; the possibility of external, incremental funding or cost-sharing; and the positive image of contributing to regional public well-being. Importantly, each enterprise benefits from the resilience of the region in which it operates. The regional coalition facilitates both integration of the enterprise analyses and collective decision-making to capture otherwise foregone public benefits. As experience accumulates, the regional coalition also becomes a shared, trusted source of new ideas for cost-effective options and local information sources. Appendix B of this paper and Brashear, et al., 2015 (especially Appendix E) describe the CISR-RMP for enterprises and regional coalitions in greater detail.
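The regional re-valuation in step R.4 can be sketched as a ranking of enterprise-unfunded options by their net benefit to the regional public. The sketch assumes each option has already been valued from both perspectives; the `Option` type, its field names and the numbers are hypothetical, and the interdependency analysis that would produce the public values is not modeled here.

```python
from typing import NamedTuple


class Option(NamedTuple):
    name: str
    enterprise_net_benefit: float  # valued from the owning enterprise's perspective
    public_net_benefit: float      # valued from the regional public's perspective
    funded_by_enterprise: bool


def unfunded_by_public_value(options):
    """Return unfunded options with positive public net benefit, best first."""
    candidates = [o for o in options
                  if not o.funded_by_enterprise and o.public_net_benefit > 0]
    return sorted(candidates, key=lambda o: o.public_net_benefit, reverse=True)


def total_foregone(options) -> float:
    """Public net benefit foregone if none of the ranked options is induced."""
    return sum(o.public_net_benefit for o in unfunded_by_public_value(options))


if __name__ == "__main__":
    portfolio = [
        Option("interconnect with neighboring utility", -10.0, 500.0, False),
        Option("pump station hardening", 50.0, 900.0, True),
        Option("redundant feeder line", -5.0, 800.0, False),
    ]
    for option in unfunded_by_public_value(portfolio):
        print(option.name, option.public_net_benefit)
```

The ranked list identifies where inducements such as cost-sharing or grants would recover the most public benefit; choosing among them under a regional budget would follow the same constrained net-benefit logic used at the enterprise level.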

5.6 Design Summary of CISR-RMP for the Federal/State Governments Satisfying long-standing legislative requirements, the regional and enterprise aggregations can help state and/or federal agencies assess the effectiveness of their CISR programs that operate through the local and regional enterprises and coalitions. Conversely, state and federal CISR programs can contribute to the effectiveness of the enterprise and regional programs by performing a number of necessary functions in each phase of the process (Figure 4). State and/or national CISR programs:

G.1 Begin each cycle by setting national and/or state goals, policies and strategies; facilitate regional coalitions; develop and test methods and tools for use at all three levels; train federal and state personnel who will provide training, technical assistance and quality assurance (TTA&QA), including validation of methods, data, assumptions and results, to regions and enterprises; and integrate timely, qualitative and quantitative intelligence into the specifications of the standard threat and hazard set.

G.2 Conduct studies to identify infrastructures and systems with national or international criticality (e.g., the North American power transmission grids) and the threats or hazards with the greatest consequences to them, then advise the responsible enterprises and regions so they are certain to be addressed in their respective CISR analyses.

9 This step addresses the classic problems of the “tragedy of the commons,” “co-benefits” and other externalities, public goods and other underinvestment in options with public benefits. Note that the enterprises are expected to make the investments that they can justify, which the regional coalition may confirm in shared analyses under the information sharing agreement. The sequence of decisions results in a form of “optimizing at the margin” across the enterprises and the region as a whole.

G.3 Analyze baseline dependencies and interdependencies of systems that are larger than regions covered by P3s and provide the results to P3 regional and enterprise interdependencies analysis; provide direct TTA&QA to enterprises and regions, including quantitative intelligence guidance on man-made threat likelihoods; develop new and improved tools and models; and provide incentives for enterprises and regions to adopt CISR-RMP into their standard business processes.

G.4 Analyze dependencies and interdependencies of the larger systems, assuming implementation of all options, both funded by enterprises and/or regions and unfunded, valuing the unfunded options from the national public perspective to determine if significant national benefits would be foregone if they remain unfunded; provide grants, cost-sharing or other incentives to fund those with greatest national public net benefits; and provide quantitative intelligence support and direct TTA&QA to enterprises and regions analyzing their own sets of options.

G.5 Study actual events around the world for insight into vulnerabilities and the consequences of various types and levels of attacks and natural events, as input to modeling and estimation; provide TTA&QA to enterprises and regions in evaluating their program outcomes (validation is most important here); conduct R&D and field tests to improve the CISR-RMP (methods, models and data) and risk/fragility mitigation options; aggregate regional and state performance assessments into a national assessment that compares the actual performance of the participating enterprises and regions against the national baseline (to measure progress) and against objectives (to evaluate national CISR programs); and submit periodic reports of all phases to the states’ governors and legislatures and the national Administration and Congress.

The lack of continuity of the federally sponsored lifeline risk/resilience analysis programs noted by both federal and local officials could be addressed by establishing a multi-agency institutional base outside government. Potentially funded jointly by IP, FEMA, Government Services Agencies, the Environmental Protection Agency, and the Departments of Energy, Transportation, Commerce and/or Housing and Urban Development, an institute could be established under the administrative umbrella of an existing non-profit organization. The institute would operate an open-source process of methods development and enhancement and a clearinghouse of effective implementation approaches. In addition, it might provide the required TTA&QA on behalf of the federal government, allowing continuity of this function at local sites. Governance of the institute would be shared among the funding agencies, representative users at the local, regional and state levels and risk experts, both academics and practitioners. A standing management committee would oversee the program on a day-to-day basis.

State and/or federal (or institute) personnel providing TTA&QA are important because they enhance the quality and consistency of the analyses; provide local, continuing, no-cost advice and coaching; accelerate learning by enterprise and regional personnel; offset some of the local costs; and integrate the analytical efforts of diverse enterprises and regions into coherent state and national programs. As methods are improved, the personnel responsible for TTA&QA can disseminate them and train users in them. A centrally directed TTA&QA capability is virtually universal among industries where risk management is essential to their success.



Evidence with related tools indicates that when federal tools are accompanied by TTA&QA, significant numbers of enterprises and local governments provide access, information and their expertise in their own systems. According to GAO (2015), in Fiscal Years 2011 through 2013, PSAs performed 3,255 assessments, the Federal Protective Service performed 1,458 and TSA performed 545. During the same period, the Coast Guard directly performed 93 risk analyses and oversaw up to 3,500 assisted self-analyses using the ordinal risk tool, MSRAM. THIRA, the conditional risk tool by FEMA with essentially complete market penetration for its target audience, is supported by annual training programs and is required of all states and UASI regions that desire to participate in certain FEMA grant programs. This evidence suggests that active, well-supported federal involvement is necessary to move technical CISR risk assessment tools of any type into widespread use by targeted users, but that it clearly can be done.

The above description of the CISR-RMP in this report should be recognized as a “snapshot” frame from the “moving picture” of risk/resilience management advancement. The process is fully expected to continue to change and adapt to new methodological insights and deeper understanding of the challenges faced by diverse lifelines and other infrastructures, local and state governments, regional coalitions and the national government. This report is simply a point along that developmental continuum. If the national program is built around the concepts of open-source software, it would be able to iterate very rapidly to incorporate improvements based on the experience and creativity of an active community of users; adapt to unforeseen circumstances and additional sectors; and enhance analysts’ and decision-makers’ abilities to tailor the process to their own needs, while maintaining the consistency of the process and comparability of results. If a multi-agency institute were established for TTA&QA, it could support the open-source process to rapidly solve emergent challenges and exploit advances in methodology, implementation practices and substantive, cost-effective CISR options.

6 Roadmap to Implementation As summarized in Section 5, the CISR-RMP is a model business process, not a tool. Its functional logic and principles may be implemented through a variety of tool configurations and still serve the purposes of enterprise risk management, interdependencies analysis and integrated regional public-private collaboration in the public interest. This notion is central to the implementation approach outlined in this section.

6.1 Implementation Innovation The project concluded with a “roadmap” to further operationalize the risk management process by simultaneously closing the most critical component gaps and suggesting a novel way of initiating CISR-RMP implementation in the field. As with the process design itself, the implementation approach is a balancing of the “ideal” – all users apply the same CISR-RMP in the same way – and the “pragmatic” – users design and incorporate their own CISR-RMP functionality as they see fit. The former would imply a degree of coercion incompatible with the collaborative approach described in the policy documents and NIPP 2013. The latter risks the rise of indefensible methods and non-comparability of results, foregoing the ability to conduct interdependencies analysis, regional analysis, cross-sector comparisons and aggregation – all key design objectives.

Figure 4: NIPP 2013 Framework & Summary of Enterprise, Regional, State and/or Federal CISR Risk Management Process

As noted in Sections 3 and 4, use of federally sponsored risk analysis tools has often been limited to requirements compliance, sometimes begrudgingly, rather than as major drivers of risk-based decision-making. Users expressed substantial frustration that federal personnel and contractors try to impose new approaches without appreciation for the local situation or the expertise, tools and processes already in use. The CISR-RMP represents a model approach, but it provides the flexibility for implementation in a wide variety of forms while still enabling users and regional coalitions to meet the objectives requiring direct comparability. A promising but untried approach would be to find ways to start with and adapt the existing business processes to incorporate the essential elements of the CISR-RMP model into the users’ routine management processes. Ideally, this would result in sustained and routine use as part of planning and resource allocation. Lifelines, local jurisdictions and regional coalitions would make coordinated decisions that determine security and resilience in ways that raise the rationality of the overall regional and, hence, national effort.

Unlike the typical “top down, outside-in” federally sponsored, local/regional security and risk programs, an organic, “bottom-up, inside out” business process engineering approach is suggested. It recognizes that user organizations possess unique and valuable knowledge, processes, models (both digital and mental) and relationships. It builds directly on the tools and data already being used by the lifeline CIs and allows them to choose the particular CISR-RMP tools to use and integrate into their ongoing processes. Where existing tools are misleading or inadequate, externally provided tools may be suggested for specific reasons, not simply imposed for uniformity. This is a business process engineering approach to build on existing processes while evolving toward the CISR-RMP model as the “ideal.” Integral to success is development of a stakeholder-validated implementation strategy in the initial stage of the process: the users must be “in charge” by being called upon to make specific process implementation decisions necessary to implement the CISR-RMP. Clearly, such an innovative implementation strategy requires developmental field pilot projects, as well as work to narrow any major gaps in the overall design.

6.2 Design Specifications and Gap Narrowing: Case Studies The CISR-RMP design summarized here meets nearly all of the detailed design specifications (Appendix C displays how). In stating the process “nearly” meets all the criteria, several “gaps” were identified, three of which are critical to address prior to initiating fieldwork:

Information sharing and protection protocols with appropriate security and legal safeguards to enable CIs and local agencies to share sensitive information with one another to support interdependencies analysis and collaborative decision-making;

Understanding of the implementation of extant risk management processes and their relationship to other existing, potentially related business processes, e.g., asset management (including major rehabilitation and replacement decisions and selection of design options), continuity planning, capital development planning and budgeting and operational planning and budgeting – such understanding being essential to refining the organic implementation process; and

Interdependencies analysis methods or models that make use of the shared information from both other CIs and other processes and extend the risk analysis to the regional and local levels.

Narrowing the first two gaps can most effectively be accomplished through case studies of the information sharing, risk management and related processes that organizations and regional collaborations are currently employing. The third will require both model development and reality testing by reconciling the data requirements of the interdependencies modeling with the data availability as established by the CISR-RMP and extant systems and the information-sharing protocol employed.
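To make the third gap more concrete, the following minimal Python sketch shows one simple way a failure could be traced through a dependency map, and how the data an interdependencies model requires might be checked against what an information-sharing protocol actually supplies. It is an illustration only, not the CISR-RMP method: the dependency map, the field names and both helper functions are hypothetical and are not taken from the report.

```python
from collections import deque

# Hypothetical dependency map: each key lists the services that depend on it.
# The names and links are illustrative only, not data from the report.
DEPENDENTS = {
    "electricity": ["water", "communications"],
    "water": ["hospital"],
    "communications": ["emergency_ops"],
    "hospital": [],
    "emergency_ops": [],
}


def cascade(initial_failure, dependents):
    """Return every service reachable from the initial failure, breadth-first."""
    affected = [initial_failure]
    seen = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                affected.append(downstream)
                queue.append(downstream)
    return affected


def missing_inputs(required_fields, shared_fields):
    """Fields the model needs that the sharing protocol does not supply."""
    return sorted(set(required_fields) - set(shared_fields))


print(cascade("electricity", DEPENDENTS))
print(missing_inputs(["outage_hours", "service_area"], ["service_area"]))
```

A real interdependencies model would also need outage durations, partial service levels and feedback loops; the reachability test here only identifies which services are exposed to a given failure.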

The first two efforts should proceed simultaneously, as preparation for planning the developmental field pilots requires input and feedback from both processes. The rest of the major gaps can be reduced in the course of a developmental field-based pilot, but these two must be resolved before such a pilot can be undertaken. One result of these case studies would be to refine and add detail to the organic implementation approach.

6.3 Developmental Field-Based Pilots Once the major gaps are narrowed and the organic implementation approach is better defined, an initial regional pilot project should be conducted in a region where multi-stakeholder CISR-focused partnerships or other collaborative mechanisms are already in place. As the initial pilot matures, two or three others should be initiated to test the generality of the process. These pilots would test the collaborative organic implementation approach and validate the feasibility and effectiveness of the CISR-RMP model process itself. The results would be used to enhance the CISR-RMP framework and its implementation approach. In the initial phase, the project team would work with users to review the users’ existing risk management and related processes relative to the “pragmatic ideal” of the CISR-RMP to determine: (1) where, if anywhere, the extant risk management processes might be improved by evolution toward the CISR-RMP, and (2) whether the products of their existing or modified processes are consistent enough with those of other users of the CISR-RMP process to support interdependencies analyses, comparisons and aggregation. Where this review suggests changes to a user’s existing processes, the user would be presented available options (pre-screened for effectiveness and consistency with the CISR-RMP) and the user would decide among them. The user would be responsible for acquiring, integrating and applying the chosen options, with continuing support from the CISR-RMP pilot team initially, phasing out in favor of well-prepared personnel responsible for TTA&QA or other appropriate experts who could provide the needed, very specific assistance. Based on the results of these pilot tests, a detailed “roll-out” implementation plan would be prepared for federal “go/no-go” decisions.

Both the CISR-RMP design and the suggested implementation approach balance the “ideal” – to make the process fully effective in allocating resources for the greatest benefit, using state-of-the-art risk management, applying identical tools in the identical way – and the “pragmatic” – to make the process simple enough to be applied, understood, integrated and adapted into existing management processes and used routinely by staffs and management of CIs, local governments and regional coalitions – as they see fit.

7 Benefits of Developing and Demonstrating the CISR Risk Management Process Successful completion of this roadmap, including a national roll-out, will result in a number of direct and indirect potential benefits to the nation, its regional communities and its lifeline infrastructures. The CISR-RMP as described:

Supports the whole decision cycle: (1) sets security and resilience priorities, (2) evaluates and selects improvement options and (3) manages implemented options by using actual, measured performance.

Encourages full integration of the CISR-RMP functionality into the ongoing routine business process of its users, so it can be sustained and routinely applied.

Supports decisions over the long term (capital plans and budgets), near term (operating plans and budgets) and real-time (situational awareness and incident and restoration management).

Quantifies in true outcome terms: resilience (expected outage), security (risk), benefits and progress, rather than intermediate “output” or vague indices.

Facilitates efficient, rational decisions because benefits are clearly defined and expressed in dollar terms, both prospective and actual, so results can be compared in conventional net-benefit, return-on-investment and benefit/cost analyses to support budget allocation decisions.

Performs technically correct analyses using transparent and simple techniques so that engineering, operations and/or management personnel of CIs and local governments can conduct the analyses and interpret the results themselves without the need for outside experts, making the results credible to decision-makers.

Mobilizes and coordinates private, utility, state and local funds and generates information necessary for federal assistance and innovative finance.

Models and manages interdependencies among infrastructures that can potentially cause cascading impacts on other infrastructures, their customers and the region.

Analyzes the consequences of impairment of cyber and manual process control systems.

Synthesizes descriptions of evolving crises for situational awareness and models alternative response plans before and during a crisis.

Sequences facility restarts and service restoration after disasters.

Incorporates man-made, technological and accidental, natural, proximity, dependency, aging infrastructure and cyber threats.

Supports analysis and options for adaptations to climate changes in terms of sea level rise and increased severity and frequency of major storms, droughts, etc.

Establishes an open and competitive environment for development of alternative tools that assist in carrying out the functionally and results-consistent process, which could stimulate significant new offerings by software developers and consulting firms.

Provides common, natural metrics necessary to measure progress for infrastructure and regional managers, federal and state grant programs, insurers, credit-rating agencies, etc.

Supplies an integrating analytical structure for holistic solutions to local challenges and for these “bottom-up” solutions to be aggregated and integrated for state and truly national programs.

Motivates public-private and private-private partnerships around common, measured resilience, security and value objectives and action programs.

Complements the “vertical” sector structure of the NIPP by providing “horizontal” integration of CIs, state and local governments and their stakeholders in every participating metro area and community.

Provides a platform for local implementation of climate change mitigation programs.

Complements implementation of PPD-8 – Preparedness; the National Preparedness Goal; and the National Preparedness System, especially in the Protection, Mitigation and Recovery mission areas.

Operationalizes the risk management framework defined in NIPP 2013 (and NIPP 2009).

Implements key elements of the DHS/IP Strategic Plan: 2012 – 2016.

Fulfills the recommendations of the State, Local, Tribal and Territorial Government Coordinating Council, the Regional Coalition Coordinating Council, and several DHS and Presidential advisory groups.

Meets recommendations of the Homeland Security Advisory Committee for an American Resilience Assessment methodology and toolkit.

Accords with the recommendations of the National Research Council and several other expert groups and with most of the relevant DHS plans, frameworks and policy.
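As an illustration of the dollar-denominated comparisons named in the list above (net benefit and benefit/cost), the following minimal Python sketch ranks a few improvement options. It is a simplified example under stated assumptions, not the valuation procedure of the CISR-RMP: the option names and dollar figures are hypothetical, benefits are taken as annual risk reduction multiplied by the analysis period, and no discounting is applied.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    annual_risk_reduction: float  # $/year: baseline risk minus risk with the option
    life_cycle_cost: float        # $ over the whole analysis period
    years: int                    # length of the analysis period

    def benefit(self) -> float:
        # Undiscounted total benefit; a fuller analysis would use present values.
        return self.annual_risk_reduction * self.years

    def net_benefit(self) -> float:
        return self.benefit() - self.life_cycle_cost

    def benefit_cost_ratio(self) -> float:
        return self.benefit() / self.life_cycle_cost


def rank_by_net_benefit(options):
    """Order options from highest to lowest net benefit."""
    return sorted(options, key=lambda o: o.net_benefit(), reverse=True)


# Hypothetical options; all figures are illustrative only.
OPTIONS = [
    Option("flood barrier", 120_000, 1_000_000, 10),
    Option("backup generator", 50_000, 200_000, 10),
    Option("extra signage", 1_000, 20_000, 10),
]

for opt in rank_by_net_benefit(OPTIONS):
    print(opt.name, opt.net_benefit(), round(opt.benefit_cost_ratio(), 2))
```

Because every option is expressed in the same dollar terms, options from different infrastructures can be placed in one ranking; an option with a negative net benefit (a benefit/cost ratio below 1) would not be justified on these figures alone.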

This “pragmatic-ideal” balancing approach of both the process model and its innovative implementation operationalizes the voluntary and collaborative nature of the plans and systems flowing from PPDs -8 and -21. It also allows collaboration with several federal programs designed to manage aspects of infrastructure and community risk. Success in this approach could lead to the CISR-RMP’s becoming a sustained, inherent part of routine management processes of CIs, local governments and regional coalitions, the place where it must be sustained and effective in truly increasing critical infrastructure and regional community security and resilience.

Acknowledgement The research reported here was sponsored by the U. S. Dept. of Homeland Security, Office of Infrastructure Protection, under Contract No. HSHQDC-14-C-00089; the opinions are the authors’. The full report, cited in references as Brashear, et al., 2015, may be obtained at http://www.nibs.org/resource/resmgr/IRDP/IP_CISR-RMP_FnlRpt.pdf.

References

ANSI/AWWA. 2010. J100-10 Risk Analysis and Management for Critical Asset Protection (RAMCAP) Standard for Risk and Resilience Management of Water and Wastewater Systems, an American National Standard, AWWA, Denver, CO.

ASME-ITI. 2007b. Sector-Specific Guidance: Water and Wastewater Systems, ASME, New York, NY.

ASME-ITI. 2009. All-Hazards Risk and Resilience: Prioritizing Critical Infrastructure Using the RAMCAP Plus Approach, ASME, New York, NY.

Brashear, J.P., et al., 2011. Regional Resilience/Security Analysis Process for the Nation's Critical Infrastructure Systems, ASME, New York, NY.

Brashear, J.P., Scalingi, P.L. and Colker, R., 2015. A Business Process Engineering Approach to Managing Security and Resilience of Lifeline Infrastructures, National Institute of Building Sciences (NIBS), Washington.

Cox, L.A., 2008b. Some Limitations of "Risk = Threat x Vulnerability x Consequence" for Risk Analysis of Terrorist Attacks, Risk Analysis, 28 (6).

Multihazard Mitigation Council, Natural Hazard Mitigation Saves: An Independent Study to Assess the Future Savings from Mitigation Activities, NIBS, 2005.

Novosel, D., et al., IEEE Report to DOE QER on Priority Issues, IEEE Joint Task Force on Quadrennial Energy Review, Washington, September 5, 2014.

Stevens, S. S., 1946. “On the Theory of Scales of Measurement,” Science, June 7, Vol. 103, No. 2684: 677–680.

U.S. Department of Homeland Security, 2011. Risk Management Fundamentals: Homeland Security Risk Management Doctrine, Washington, DC.

U.S. Department of Homeland Security, 2013a. NIPP 2013: Partnering for Critical Infrastructure Security and Resilience, Washington, DC.

U.S. Department of Homeland Security, 2013b. NIPP Supplemental Tool: Executing a Critical Infrastructure Risk Management Approach, Washington, DC.

U.S. Presidential Executive Order (EO) 13636: Improving Critical Infrastructure Cybersecurity, February 12, 2013. Available online: https://www.whitehouse.gov/the-press-office/2013/02/12/executive-order-improving-critical-infrastructure-cybersecurity.

U.S. Presidential Policy Directive/PPD-8: National Preparedness, March 30, 2011. Washington, DC.

U.S. Presidential Policy Directive/PPD-21: Critical Infrastructure Security and Resilience, February 12, 2013. Washington, DC.

Weiss, D.J. and Weidman, J., “Disastrous Spending: Federal Disaster-Relief Expenditures Rise amid More Extreme Weather,” Center for American Progress, accessed May 18, 2015 at https://www.americanprogress.org/issues/green/report/2013/04/29/61633/disastrous-spending-federal-disaster-relief-expenditures-rise-amid-more-extreme-weather/


Appendix A. Detailed Design Specifications by Source

| Basis of Specification | No. | CISR-RMP Design Specifications |
|---|---|---|
| Federal Policy: NIPP 2013 | 1 | CI risk estimated by identifying what assets are critical, taking interdependencies into account |
| | 2 | Threat, vulnerability and consequences to support rational choices among action options |
| | 3 | Selected options implemented & their performance evaluated |
| | 4 | Include physical, cyber and human assets |
| NIPP 2013 Supplemental | 5 | Documented – self-documenting, fully explicit; decision-oriented |
| | 6 | Reproducible – measurement reliability; comparable/consistent across time; minimum subjectivity |
| | 7 | Defensible – integrated & compliant with standards of risk & uncertainty management disciplines |
| Risk Mgmt Fundamentals | 8 | Unity of Effort – holistic integration & synchronization of entities w/ risk-mgmt responsibilities |
| | 9 | Transparency – clear, open and direct communications |
| | 10 | Adaptability – dynamic & responsive to changing conditions and improving methods |
| | 11 | Practicality – simple & useable, given analytic/data limitations, organizational & political realities |
| | 12 | Customization – common analysis but local choices, designs of improvement options |
| Implicit | 13 | Accountability – measurement & reporting of actual results in improved risks & resilience |
| | 14 | Advance PPDs 8 & 21, CISR R&D Plan, IP Strategic Plan for local/regional integrated programs |
| Technical Defensibility | 15 | Set goals, objectives & priorities (weights) systematically |
| | 16 | Standardized threat/hazard set that is mutually exclusive and collectively exhaustive |
| | 17 | Asset criticality based on mission |
| | 18 | Risk = Threat Likelihood × Vulnerability × Consequences, all in ratio scales, $, casualties, other |
| | 19 | Resilience measured in ratio scale (preferably based on expected outage, fragility) in units & $ |
| | 20 | Common definitions, process & threats for consistent, comparable metrics across sectors |
| | 21 | Meet all conditions for meaningful aggregation within/across sectors and to higher levels |
| | 22 | Uncertainty explicitly treated using at least sensitivity analysis |
| | 23 | Dependencies & interdependencies modeled explicitly |
| | 24 | Options based on site/design/construct, prevent, protect, mitigate, respond & recovery |
| | 25 | Explicit valuation of risk/fragility reduction benefits & life-cycle costs |
| | 26 | Rational resource allocation to options |
| | 27 | Managed, monitored & documented implementation and operations of selected options |
| | 28 | Resources allocated so incremental benefits are paid by CI, local govt, regional P3, state, federal |
| | 29 | Explicit performance evaluation of amount of risk- & fragility-reduction achieved |
| | 30 | Full uncertainty with Monte Carlo simulation of risk & expected outage, with interdependencies |
| User Design Specs | 31 | Model protocol for information sharing |
| | 32 | External initiation by recognized authority, e.g., industry standard, state or federal standard |
| | 33 | Easy to use, free or low-cost system, with improvements through open process |
| | 34 | Enable analysis to address internal business case and regional community case simultaneously |
| | 35 | Provide immediate and obvious value to CI & local gov’t decision-makers |
| | 36 | Analysis conducted by employees of CIs & local agencies, with training and technical assistance |
| | 37 | Standard threat/hazard set including, especially, weather hazards due to climate change |
| | 38 | Common analytical process for dependencies, but not mandated solutions |
| | 39 | Low or no-cost technical assistance from local, state or federal employees trained in depth |
| | 40 | Liability resolution for untreated risks accepted in a rational, analytically based trade-off process |
| | 41 | Address major dependencies & interdependencies in fully protected information sharing process |
| | 42 | Integrate with extant asset mgmt, planning, budgeting, development systems of CIs, local govt |

* Certain desirable specifications, such as full capture of uncertainty and correlations in estimates, Monte Carlo combination of results, output as distributions and portfolio optimization, are deferred for the near term as requiring too much user education to be included at present. These should be developed for the future and introduced as user sophistication grows.
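Specifications 18, 21 and 22 above can be illustrated with a minimal Python sketch: risk computed as Threat Likelihood × Vulnerability × Consequences on ratio scales, expected values summed across threat-asset pairs, and a simple one-at-a-time sensitivity analysis in place of the deferred Monte Carlo treatment. The threat-asset pairs, dollar values and the ±50% swing are hypothetical choices made for illustration, not figures from the report.

```python
def risk(threat_likelihood, vulnerability, consequence):
    """Expected loss for one threat-asset pair: R = T x V x C."""
    return threat_likelihood * vulnerability * consequence


def total_risk(pairs):
    """Sum expected losses across threat-asset pairs (all in the same units)."""
    return sum(risk(t, v, c) for t, v, c in pairs)


def one_at_a_time_sensitivity(t, v, c, swing=0.5):
    """Vary each input by +/- swing while holding the others at point estimates.

    Returns {input_name: (risk_at_low_value, risk_at_high_value)}.
    Vulnerability is a conditional probability, so its high value is capped at 1.0.
    """
    return {
        "T": (risk(t * (1 - swing), v, c), risk(t * (1 + swing), v, c)),
        "V": (risk(t, v * (1 - swing), c), risk(t, min(v * (1 + swing), 1.0), c)),
        "C": (risk(t, v, c * (1 - swing)), risk(t, v, c * (1 + swing))),
    }


# Hypothetical pairs: (annual likelihood, vulnerability, consequence in $).
PAIRS = [
    (0.01, 0.5, 2_000_000),  # e.g. flood at a pump station
    (0.10, 0.2, 100_000),    # e.g. ice storm on a feeder line
]

print(total_risk(PAIRS))
print(one_at_a_time_sensitivity(*PAIRS[0]))
```

Because the three factors are on ratio scales and the result is an expected value in dollars, the per-pair risks can be added, which is the property specification 21 relies on. The sensitivity ranges show whether a plausible change in any one input would be large enough to alter a decision.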


Appendix C. CISR-RMP Design Specifications Fulfilled by CISR Risk Management Process [Entries in Red Indicate That the Specification Warrants Additional Development]

| Basis of Specification | # | CISR-RMP Design Specifications (Abbreviated from Table 3) | Features of CISR-RMP to Meet Design Specifications | Readiness |
|---|---|---|---|---|
| Federal Policy Design Specs: NIPP 2013 (from PPD-21) | 1 | CI risk for critical assets & interdependencies | Criticality based on role in carrying out core mission; interdependencies modeled explicitly | 10, 7 |
| | 2 | R = T×V×C; rational resource allocation | R = T×V×C for each, CI & region; rational resource allocation based on net benefits to CI & region | 10, 10 |
| | 3 | Selected options implemented & performance evaluated | Implementation monitored & actual reduction in risk & fragility measured | 10, 10 |
| | 4 | Include physical, cyber & human assets | Physical & human explicitly treated; cyber treated as loss of automated control & according to Cyber Frwk | 9 |
| NIPP 2013 Supplemental | 5 | Documented – fully explicit; decision-oriented | Self-documenting in use; whole analysis oriented to 3 core decisions: TA pairs to analyze, options, eval. | 10 |
| | 6 | Reproducible – reliability; comparable/consistent data | Consistency/comparability rigorously controlled | 10 |
| | 7 | Defensible – integrated & compliant with risk disciplines | Meets basic tenets, with purposeful (temporary) simplification to aid introduction & initial use | 10 |
| Risk Mgmt Fundamentals | 8 | Unity of Effort – holistic integration & synchronization | Common process for all CIs & local agencies with explicit regional depend./interdepend. & all 3 decisions | 8 |
| | 9 | Transparency – clear, open and direct communications | Clear process & measurements, protected direct communications on regional scale | 10 |
| | 10 | Adaptability – dynamic & responsive | Process explicitly open for expected improvements & adaptations to emerging threats & hazards | 10 |
| | 11 | Practicality – simple & useable, given realities | Readily useable by local staff (when trained & assisted), practical level of initial modeling | 8 |
| | 12 | Customization – common analysis but local choices | Common, consistent process but complete openness to locally designed risk- & fragility-reduction options | 10 |
| Implicit | 13 | Accountability – measure actual improved risks/resilience | Same methods from baseline & investment decisions evaluate actual risk/fragility reduction; true outcomes | 8 |
| | 14 | Advances PPDs 8 and 21, CISR R&D Plan & IP Strat. Plan | Practical yet rigorous risk basis for all pieces – includes both CIs and community emergency resp./recov. | 9 |
| Technical Defensibility Specs | 15 | Goals, objectives & systematic weights | Goals, objectives & priority weightings using AHP | 10 |
| | 16 | Standardized threat/hazard set; likelihood of man-made | Standardized mutually exclusive, collectively exhaustive threat/hazard set; locally adaptable | 10, 2 |
| | 17 | Asset criticality based on mission | Explicit asset identification & criticality assignment based on mission and gross consequences of loss | 10 |
| | 18 | R = T×V×C, all on ratio scales, in $, casualties, other | R = T×V×C, all in ratio scales, all in point estimates (with sensitivity analysis); later probability distributions | 10 |
| | 19 | Resilience measured on ratio scale, in units & $ | Resilience measured by Fragility = Outage × V × C, Outage = Duration × Severity; all ratio point estimates | 10 |
| | 20 | Consistent, comparable metrics across sectors | Common process has been used in water, roads/bridges, electricity distribution, emergency ops & comm. | 9 |
| | 21 | Meaningful aggregation within/across sectors & levels | Expected values on ratio scales may be added because the necessary conditions are all met | 10 |
| | 22 | Uncertainty explicitly treated | Uncertainty analyzed by sensitivity analysis for decision-change | 6 |
| | 23 | Dependencies/interdependencies & regional economics | All consistency requirements met; actual modeling in progress | 3, 7 |
| | 24 | Options from design/construct/prevent/protect/mitigate/R/R | Full spectrum of potential options explicitly considered | 10 |
| | 25 | Explicit option valuation in $ of benefits & life-cycle costs | Options valued by consistent life-cycle net benefits & costs from both CI and regional public’s perspective | 10 |
| | 26 | Rational resource allocation to options, $ to $ | Joint-benefit options explicitly analyzed; rational trade-off analysis at both CI & regional levels | 9 |
| | 27 | Managed, implementation and operations of options | CI’s routine accounting & project management techniques for implementation | 10 |
| | 28 | Incremental $ to incremental benefits by level, CI to Federal | Mobilizes private, utility, local public, state & Federal $ in sequence to apply incremental $ to incr. benefits | 9 |
| | 29 | Explicit performance evaluation risk/fragility reduction, in $ | Actual experience (local & other) plus exercises & red-teams support full actual risk/fragility measurement | 7 |
| | 30 | Full uncertainty & Monte Carlo simulation of risk/fragility | Minimal acceptable by risk discipline, but deferred in favor of user acceptance based on point estimates | 3 |
| User Design Specifications | 31 | Model protocol for information sharing | Several regions have developed such, but have not been synthesized or legally vetted | 7 |
| | 32 | External initiation by recognized authority | THIRA, J100, and several Federal indicator or ordinal tools have been accepted and are in use | 10 |
| | 33 | Easy to use, free or low-cost open system | THIRA, J100 are being used without complaint about cost or effort, but could be circumstantial | 9 |
| | 34 | Internal business case & regional community case, both in $ | Provides both based on analysis of same threats, vulnerabilities, different perspectives on consequences | 9 |
| | 35 | Immediate and obvious value to decision-makers | Being used by decision-makers in several CIs | 9 |
| | 36 | Conducted by employees of CIs & local agencies | Being done by local employees with and without outside consulting experts | 10 |
| | 37 | Standard threat/hazard set including climate change | Standard threat/hazard set, with local modifications, well accepted; climate-related threats major concern | 10 |
| | 38 | Common analytical process for dependencies | CISR-RMP has set the conditions, esp. common, consistent estimates, but tool must be developed/tested | 5 |
| | 39 | Low or no-cost Federal/state technical assistance | Positive examples from PSAs, FPS, FEMA, TSA; existing Federal/state personnel could be trained (?) | 8 |
| | 40 | Liability resolution for untreated, analyzed risks accepted | Major challenge beyond scope of present project; needs major effort | 2 |
| | 41 | Major interdependencies information sharing | Solid examples in several regions, but their analyses are rudimentary; adequate tool development key | 5 |
| | 42 | Integrates with extant asset mgt./planning/budgeting systems | Uses same metrics as these systems and complements asset management by quantifying risk & fragility | 6 |
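Specification 15 above refers to priority weightings using AHP, taken here to mean the Analytic Hierarchy Process. As a minimal sketch of how such weights might be derived, the Python example below uses the row geometric-mean approximation, a common shortcut that matches the principal-eigenvector result when the pairwise judgments are perfectly consistent. The three goals and the comparison values are hypothetical, and the report does not state which AHP variant the CISR-RMP uses.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the normalized geometric mean of each row.

    pairwise[i][j] states how many times more important goal i is than goal j,
    so pairwise[j][i] should equal 1 / pairwise[i][j] and the diagonal is 1.
    """
    n = len(pairwise)
    row_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(row_means)
    return [m / total for m in row_means]


# Hypothetical goals and judgments, for illustration only.
GOALS = ["public safety", "service continuity", "cost containment"]
PAIRWISE = [
    [1.0, 2.0, 4.0],    # safety vs. (safety, continuity, cost)
    [0.5, 1.0, 2.0],    # continuity vs. ...
    [0.25, 0.5, 1.0],   # cost vs. ...
]

for goal, weight in zip(GOALS, ahp_weights(PAIRWISE)):
    print(f"{goal}: {weight:.3f}")
```

With these judgments the weights come out in the ratio 4 : 2 : 1. Real pairwise judgments are rarely this consistent, so a fuller treatment would also compute a consistency ratio before the weights are used to set priorities.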

User Design Specifications 31 Model protocol for information sharing Several regions have developed such, but have not been synthesized or legally vetted 7 32 External initiation by recognized authority THIRA, J100, and several Federal indicator or ordinal tools have been accepted and are in use 10 33 Easy to use, free or low-cost open system, THIRA, J100 are being used without complaint about cost or effort, but could be circumstantial 9 34 Internal business case & regional community case, both in $ Provides both based on analysis of same threats, vulnerabilities, different perspectives on consequences 9 35 Immediate and obvious value to decision-makers Being used by decision-makers in several CIs 9 36 Conducted by employees of CIs & local agencies Being done by local employees with and without outside consulting experts 10 37 Standard threat/hazard set including climate change Standard threat/hazard set, with local modifications, well accepted; climate-related threats major concern 10 38 Common analytical process for dependencies CISR-RMP has set the conditions, esp. common, consistent estimates, but tool must be developed/tested 5 39 Low or no-cost Federal/state technical assistance Positive examples from PSAs, FPS, FEMA, TSA; existing Federal/state personnel could be trained (?) 8 40 Liability resolution for untreated, analyzed risks accepted Major challenge beyond scope of present project; needs major effort 2 41 Major interdependencies information sharing Solid examples in several regions, but their analyses are rudimentary; adequate tool development key 5 42 Integrates with extant asset mgt./planning/budgeting systems Uses same metrics as these systems and complements asset management by quantifying risk & fragility 6