
HRO Basics: History, Culture, Systems, and Leadership

Janice N. Tolk, Ph.D.

© Janice N. Tolk 2016. All Rights Reserved.

Normal Accident Theory

• Normal Accidents: Living with High-Risk Technologies by Charles Perrow

• Systems accident – unanticipated interaction of multiple failures

• Interactive complexity – system contains unfamiliar, unplanned, or unexpected sequences that are not visible and not immediately comprehensible

• Tight coupling – time dependent pressures that cannot wait for a response, sequences are invariant, one way to reach a production goal

• Normal Accidents are inevitable in interactively complex and tightly coupled systems

• People make mistakes

• Big accidents can escalate from small initiators

• Failures arise from human error, system behavior, and technology

Perrow (1999)

Individual Accident

• Industrial Accident - The worker is harmed by the plant (e.g., radiation exposure, burns, trips, slips, falls, etc.)

• Focus: Protect the worker from the plant.

[Diagram: Plant (Hazard) → Humans (Receptor)]

US DOE (2008)

System Accident

• System Accident – The system fails, allowing a threat to release the hazard and adversely impact many people (e.g., workers, enterprise, and surrounding environment).

• Focus: Protect the plant from the threats.

[Diagram: Plant (Hazard) acted on by three threats: human errors; equipment, tooling, and facility malfunctions; natural disasters]

US DOE (2008)

System Accident

• Important!

• Emphasizing the system accident in no way diminishes the importance of individual safety; rather, individual safety is a prerequisite of an HRO.

• A focus on individual accidents is not enough – HROs consider ALL aspects of the system and its interactions from every direction.

• Operating as an HRO incorporates the safety programs already resident in an organization, e.g., industrial safety, radiation safety, chemical safety, etc.

• In addition to safety programs, HRO incorporates many existing management programs, e.g., performance metrics, reporting, lessons learned, accident investigations, etc.

US DOE (2008)

HRO Theory

• University of California, Berkeley – Dr. Karlene Roberts et al.

• Studied organizations that were effectively managing and operating complex and intrinsically hazardous technical systems

• Are there high risk organizations that have operated nearly “error free” over long periods of time?

• If so, what do these organizations do to reduce the probabilities of serious error?

• Counter-argument to Perrow’s Normal Accident Theory

• Research question: “How many times could this operation have failed with catastrophic consequences that it did not fail?”

• Dubbed these organizations “high reliability organizations”

Roberts (1989)

NAT vs. HRO

• The Limits of Safety by S. D. Sagan

• Studied differences between NAT and HRO theories based on nuclear weapons organizations

• Competing general expectations

• Both theories agree that interactive complexity and tight coupling can, in theory, lead to a system accident

• NAT theorists believe system accidents are inevitable

• HRO theorists believe organizations can control the environment and prevent such an accident

• Organizational design

• Culture

• Management

• Human choice

Sagan (1993)

HRO - Definition

• An HRO is ….

• An organization that can repeatedly (thousands of times) operate tightly coupled, interactively complex high risk processes without a catastrophic event despite significant hazards, time constraints, and complex technologies.

• Focused on system accidents

• An HRO is not …

• Every high risk operation

• It must exhibit all eight HRO characteristics simultaneously

• Focused only on individual accidents

• Nevertheless

• HRO theory has spread to many industries

• Health care, coal mining, petroleum, nuclear weapons, transportation, etc.

B&W Pantex (2011)

Eight Characteristics of HROs

• Hyper-complexity

• Extreme variety of components, systems, and levels.

• Tight coupling

• Reciprocal interdependence across many units and levels

• Extreme hierarchical differentiation

• Multiple levels, each with its own elaborate control and regulating mechanism

• Large numbers of decision makers in complex communication networks

• Characterized by redundancy in control and information systems

Roberts & Rousseau (1989)

Eight Characteristics of HROs

• Degree of accountability that does not exist in most organizations

• Substandard performance or deviations from standard procedures meet with severe adverse consequences

• High frequency of immediate feedback about decisions

• Compressed time factors

• Cycles of major activities are measured in seconds

• More than one critical outcome must happen simultaneously

• Simultaneity signifies both the complexity of operations and the inability to withdraw or modify operational decisions

Roberts & Rousseau (1989)

Transportation Systems Display Characteristics of HROs

• Transportation involves high-risk systems that are tightly coupled and interactively complex.

• There are large numbers of decision makers in complex communication networks.

• There are simultaneous critical outcomes

• Catastrophic events happen, and are typically the result of systems interactions as opposed to human error (individual accidents)

• Lac-Mégantic Railroad Disaster


Six Actions for Managers

• Evaluate the cost of safeguards to prevent accidents vs. the cost of the accidents themselves (in terms of money, lives, and public outcry).

• Develop strategies (communication, temporary structures, etc.) to ameliorate the negative effects of tight coupling.

• Evaluate system for interdependence and develop mechanisms (special units, etc.) for managing it.

Roberts (1989)

Six Actions for Managers (continued)

• Evaluate the cost of redundancy and training vs. the cost of accident recovery.

• Develop decision making strategies appropriate to operating complex technologies in time-dependent settings.

• Analyze the components of high-reliability cultures (such as task, security, safety, accountability, and responsibility) and how to build them into an organization.

Roberts (1989), Sagan (1993), Rochlin (1996)

Mindfulness

• Managing the Unexpected by Karl Weick and Kathleen Sutcliffe

• Weick differentiated human reliability from high reliability

• Human reliability is concerned with human performance – performing an activity correctly

• HROs work to prevent human error from becoming a system error

• Weick and Sutcliffe characterized HROs by the presence of five hallmarks that facilitate problem detection and organizational management

Weick (1989), Weick & Sutcliffe (2001)

Five Hallmarks of HROs

• Mindfulness

• Anticipating the unexpected

• Pre-occupation with failure

• Reluctance to simplify interpretations

• Sensitivity to operations

• Containing the unexpected

• Commitment to resilience

• Deference to expertise

Weick & Sutcliffe (2001)

Characteristics, Hallmarks, Actions?

• Eight characteristics defined what Roberts, et al., believed differentiated the research subjects as HROs

• Characteristics that could lead to a system accident

• Hyper-complexity

• Tight coupling

• Compressed time factors

• Multiple critical simultaneous outcomes

• Note – these exceed the characteristics defined by Perrow


Characteristics, Hallmarks, Actions?

• Eight characteristics defined what Roberts, et al., believed differentiated the research subjects as HROs (continued)

• Characteristics that could prevent a system accident

• Extreme hierarchical differentiation

• Large number of decision makers in complex communication networks

• High degree of accountability

• High frequency of immediate feedback about decisions

• What this amounts to …

• Multiple alert and trained eyes, ears, and minds monitoring and responding continuously and promptly

• Organizational culture conducive to prompt action, reporting anomalies, and quick response under upset conditions

Characteristics, Hallmarks, Actions?

Characteristics (Roberts & Rousseau) | Hallmarks (Weick & Sutcliffe) | Actions (Roberts)
Hyper-complexity | Pre-occupation with failure | Cost of safeguards vs. cost of accidents
Tight coupling | Reluctance to simplify | Strategies to ameliorate effects of tight coupling
Compressed time factors | Sensitivity to operations | Evaluate systems for interdependence and manage it
Multiple critical simultaneous outcomes | Commitment to resilience | Cost of redundancy and training vs. cost of accident recovery
Extreme hierarchical differentiation | Deference to expertise | Decision making strategies for operating complex, time-dependent operations
Multiple decision makers in complex networks | – | Build high reliability culture characteristics into the organization
High degree of accountability | – | –
High frequency of immediate feedback | – | –

Safety Culture and HRO

• HRO depends on an organizational culture of mindfulness

• Mindfulness needs to be part of the organizational culture

• Developed through top management’s beliefs, values, and actions, reinforced through communication

• Safety culture is (sometimes) a subset of organizational culture


Organizational Culture Defined

A pattern of shared basic assumptions that was learned by a group as it solved its problems of external adaptation and internal integration, that has worked well enough to be considered valid and, therefore, to be taught to new members as the correct way to perceive, think, and feel in relation to those problems.

Schein (2010)

Observing Organizational Culture


Artifacts and Behaviors

• What You Do

Espoused Values and Beliefs

• What You Say You Are Going to Do

Underlying Assumptions

• What You Really Feel You Should Do

The key to understanding organizational culture is to understand the underlying assumptions.

Schein (2010)

Role of Leadership and Culture

• The role of the leader

• Culture creation

• Culture embedment

• Culture change

• Culture creation

• Beliefs, values, and assumptions of founders

• Learning experiences of group members

• New beliefs, values, and assumptions brought by new members and leaders

Schein (2010)

Primary Embedding Mechanisms

• Attention and control

• Reaction under crisis

• Resource allocation

• Role modeling, teaching, and coaching

• Allocation of rewards and status

• Recruitment, selection, promotion, and dismissal

Schein (2010)

Secondary Embedding Mechanisms

• Organizational design and structure

• Organizational systems and procedures

• Rites and rituals

• Physical design

• Stories about events and people

• Formal statements of organizational philosophy, creeds, and charters

Schein (2010)

Safety Culture Survey

• Safety Culture Indicator Scale Measurement System (SCISMS) developed for the commercial aviation industry based on research into safety culture and safety climate.

• Combined from theory and assessments of organizational accidents and surveys from nuclear power, manufacturing, military aviation, petroleum, and construction industries.

• SCISMS was validated and improved over a series of years

FAA (2008)

SCISMS Constructs

• Organizational Commitment to Safety

• Safety values

• Safety fundamentals

• Going beyond compliance

• Operational Personnel

• Supervisors/foremen

• Maintenance supervision

• Trainers

Weigmann, von Thaden, & Gibbons (2007)

SCISMS Constructs (continued)

• Formal Safety System

• Reporting system

• Feedback and response

• Safety personnel

• Informal Safety System

• Accountability

• Authority

• Employee professionalism

• Safety Behaviors/Outcomes

• Perceived personal risk/safety behavior

• Perceived organizational risk

Weigmann, von Thaden, & Gibbons (2007)
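As a rough illustration of how results from a construct-based instrument like SCISMS might be summarized, the sketch below averages Likert-scale responses by construct. The item-to-construct mapping, the 1–5 scale, and the scoring rule here are assumptions for illustration only, not the published FAA methodology.

```python
# Illustrative sketch only: averaging Likert-scale survey responses by
# construct. Construct names follow the slides; the item mapping and
# scoring rules are hypothetical, not the published SCISMS method.
from statistics import mean

ITEM_CONSTRUCT = {  # hypothetical mapping of survey items to constructs
    "q1": "Formal Safety System",
    "q2": "Formal Safety System",
    "q3": "Informal Safety System",
}

def construct_scores(responses):
    """Mean score (1-5 Likert) per construct across all respondents."""
    by_construct = {}
    for resp in responses:
        for item, score in resp.items():
            by_construct.setdefault(ITEM_CONSTRUCT[item], []).append(score)
    return {c: mean(scores) for c, scores in by_construct.items()}

responses = [{"q1": 4, "q2": 5, "q3": 3}, {"q1": 3, "q2": 4, "q3": 4}]
print(construct_scores(responses))
```

A real instrument also handles reverse-coded items, missing responses, and validated weighting, all of which this sketch omits.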

SCISMS and HRO

• Comparing SCISMS constructs with HRO hallmarks reveals similarities

• Organizational commitment to safety ~ Deference to expertise

• Operations interactions ~ Commitment to resilience

• Formal safety system ~ Pre-occupation with failure

• Informal safety system ~ Reluctance to simplify

• Safety behaviors/outcomes ~ Sensitivity to operations


So What?

• If a company has a strong safety culture and is mindful of the unexpected, is it an HRO?

• Maybe!

• HRO theory is based on preventing systems accidents, so is safety culture and mindfulness enough?

• No – attention needs to be given to the systems as well.


So, What Else Besides Safety Culture?

• Four HRO Practices

• Practice #1 – Manage the systems, not the parts

• Practice #2 – Reduce system variability

• Practice #3 – Foster a strong culture of reliability

• Practice #4 – Learn and adapt as an organization

US DOE (2008)

Deming’s Theory of Profound Knowledge

• Provides a foundation for the systems approach to managing high hazard operations

• Became the foundation for a process that makes the characteristics, practices, and actions proposed by HRO theorists actionable


Theory of Profound Knowledge

• Knowledge of Systems – Organizations are systems that interact within their internal and external environments

• Knowledge of Variation – Statistical process control is the foundation of process optimization

• Knowledge of Psychology – Organizations have cultures that influence the system and the desired outcome

• Knowledge of Knowledge – Theory, prediction, action, and feedback are the basis of learning

Systems Approach to Operate as an HRO

Deming (1994)
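Deming's "knowledge of variation" point can be made concrete with a minimal Shewhart individuals control chart, a standard statistical process control tool. The weekly anomaly counts and the moving-range estimate of sigma below are illustrative assumptions, not material from the slides.

```python
# Minimal sketch of a Shewhart individuals (X) control chart, using
# standard SPC conventions: 3-sigma limits estimated from the average
# moving range. Data are hypothetical weekly anomaly-report counts.

def control_limits(values):
    """Return (center, lcl, ucl) for an individuals chart.

    Sigma is estimated as mR-bar / 1.128 (the d2 constant for
    moving-range subgroups of size 2).
    """
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    return center, center - 3 * sigma, center + 3 * sigma

def out_of_control(values):
    """Indices of points beyond the 3-sigma limits (special-cause signals)."""
    _, lcl, ucl = control_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical data: the spike in week 7 signals special-cause variation
weekly_anomalies = [4, 5, 3, 6, 4, 5, 4, 20, 5, 4]
print(out_of_control(weekly_anomalies))  # [7]
```

The point of such a chart in an HRO context is to separate routine (common-cause) variation from signals that deserve systemic investigation, rather than reacting to every fluctuation.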

Construct of the HRO

HRO Practice 1: Manage the system, not the parts

• Organizations are complex systems
• Master the systems to reduce complexity and variability
• Map systems to decrease surprises
• Understand system constraints

HRO Practice 2: Reduce system variability

• Use available information and data
• Use feedback to understand system causes for anomalies
• Communicate!

HRO Practice 3: Foster a strong culture of reliability

• Create a culture of reliability
• Encourage questioning and anomaly reporting
• Be alert to deviations
• Safety is a priority

HRO Practice 4: Learn and adapt as an organization

• Create a learning organization
• Maintain a continuous flow of information
• Mine data for trends
• Understand the difference between work-as-imagined and work-as-performed

Systems Approach to Operate as an HRO

US DOE (2008)

HRO Practice 1: Manage the System, Not the Parts

• Organizations are complex systems open to interaction with their environment

• Exert AND receive influence

• System accidents are rooted in complex interactions between engineered components and human operators

• HROs understand and master their systems

• Systems are predictable and reliable, and warning signs of trouble are quickly detected

• Map systems to understand dependencies and interactions between processes and the space between processes

• Understand safety constraints, including how they may be violated

• Design detection features to recognize violations

Tolk & Hartley (2011)
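The "map systems to understand dependencies and interactions" idea can be sketched as a small directed graph of process couplings: given one failed process, which downstream processes could the failure reach? The process names and couplings below are hypothetical, chosen only to show the technique.

```python
# Illustrative sketch only: process dependencies as a directed graph,
# with a reachability query showing how far one failure could propagate.
# Process names and couplings are hypothetical, not from the slides.

def reachable(dependencies, start):
    """All processes downstream of `start` in a dependency graph.

    `dependencies` maps each process to the processes that depend on it.
    """
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in dependencies.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Hypothetical plant: cooling feeds both the press and the assay lab,
# and the press feeds final assembly.
dependencies = {
    "cooling": ["press", "assay lab"],
    "press": ["final assembly"],
}
print(sorted(reachable(dependencies, "cooling")))
# ['assay lab', 'final assembly', 'press']
```

Even a toy map like this makes the "space between processes" visible: a cooling failure touches final assembly two steps away, an interaction a parts-focused view would miss.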

HRO Practice 2: Reduce System Variability

• Three sources of information – likely already present

• Internal reporting system

• Systemic analysis of accidents and events

• Systemic analysis of accidents in the industry

• Feedback

• Understand if corrective actions address underlying causes of accidents AND prevent recurrence

• Communicate what needs to be reported

• Culture must support honest reporting

• Don’t be lulled into complacency by consistent good safety performance!

• Industrial safety metrics are important, but are not an indicator of system safety

Tolk & Hartley (2011)

HRO Practice 3: Foster a Strong Culture of Reliability

• Culture of Reliability - Safety is the highest priority and is essential for long term success.

• Encourage questioning and reporting anomalies

• Major system accidents are often traced to failures in the safety management system

• The culture must be strong, sustainable, and the primary focus for all activities

• Safety performance is an equal priority with production performance

• Don’t normalize deviance

• Be alert to practical drift

• Leaders exhibit safety as a priority

• Accidents are typically the result of system failures, not human failures

Tolk & Hartley (2011)

HRO Practice 4: Learn and Adapt as an Organization

• HROs are learning organizations

• Continuous flow of information

• Information is not hoarded

• Effective reporting mechanisms

• Data mining for trend analysis

• Understand the difference between Work-as-Imagined and Work-as-Done

Tolk & Hartley (2011)
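The "data mining for trend analysis" bullet can be illustrated with a minimal count-based trend check over anomaly reports. The report fields and categories below are hypothetical; a real program would work against its own reporting-system schema.

```python
# Illustrative sketch only: counting anomaly reports by (month, category)
# to surface rising trends. Report fields and categories are hypothetical.
from collections import Counter

def trend_counts(reports):
    """Count reports per (month, category) pair."""
    return Counter((r["month"], r["category"]) for r in reports)

def rising_categories(reports, earlier, later):
    """Categories whose report count increased from `earlier` to `later`."""
    counts = trend_counts(reports)
    categories = {r["category"] for r in reports}
    return sorted(c for c in categories
                  if counts[(later, c)] > counts[(earlier, c)])

reports = [
    {"month": "2016-01", "category": "procedure deviation"},
    {"month": "2016-01", "category": "tooling"},
    {"month": "2016-02", "category": "procedure deviation"},
    {"month": "2016-02", "category": "procedure deviation"},
]
print(rising_categories(reports, "2016-01", "2016-02"))
# ['procedure deviation']
```

The value here is the habit, not the code: low-consequence, information-rich events only become learning material when someone routinely aggregates and compares them over time.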

HRO - Safety Culture Plus So Much More

• Safety culture is one aspect of highly reliable operations.

• Arguably the largest aspect, but not the only one

• Organizations need to understand their systems

• Design, interactions, ACTUAL operations, variations

• Organizations need to use their data as feedback to find trends and opportunities

• Organizations need to be continually learning from past and present operations, incidents, anomalies, and successes

• All four practices are essential to gaining understanding and even a modicum of control of complex, tightly coupled, and interactive operating systems

• Measure safety culture, but don’t stop there!

Getting Specific

• Develop an HRO-oriented end state definition specific to your company

• Example: An organization that repeatedly accomplishes its high hazard mission while avoiding catastrophic events, despite significant hazards, dynamic tasks, time constraints, and complex technologies.

• Develop an operational definition

• Example: Operators take a systems approach to performing work and all subsystems and interfaces are understood and optimized in context of the operating system. Systems are designed to prevent errors as well as to be resilient.

• Determine the worst thing that can happen to your company

• Pinnacle event – an event that results in major consequences that could stop operations for an indeterminate period of time.

• Plateau event – an event that indicates loss of control of the system, process, or operation and results in loss of confidence of management, stakeholders, and shareholders.

B&W Pantex (2011)

Getting Specific (continued)

• A vision is essential and must be communicated

• Strategies and objectives must be developed

• The end state must be defined

• There must be a basis for a gap analysis

• Everyone must understand why this is important

• Explicitly define the worst thing that could happen

B&W Pantex (2011)

Challenges & Roadblocks to Success

• Impatience

• Change is slow and culture change is really slow

• Loss of senior manager champion

• Explaining to employees that this isn’t another “flavor of the month”

• HRO is an operating paradigm that encompasses many programs already in place and provides a consistent basis for system design and decision making

• Without an initiating event, employees may not see the need or feel any urgency

• Systems thinking doesn’t come easily to everyone


How to Determine Readiness

• The organization demonstrates most of the characteristics of an HRO

• Operating systems are designed, mature, and stable

• There has been no catastrophic event (you aren’t in a panic!)

• The organization is willing to devote resources to understand the underlying causes of low consequence, information rich events

• The organization is willing to take strong action in response to weak signals

• The organization is willing to change


What Can You Expect from the HRO Journey?

• Focus on the most important systems and processes

• Decreases the gap between work-as-imagined and work-as-done

• Help everyone understand their role in the bigger system

• Increased Value to Customers and Regulators

• Increased Employee Involvement & Buy-in

• Positive atmosphere where employees report errors

• Empowerment

• Framework to understand overall context

• Ability to challenge

• Responsibility to engage

B&W Pantex (2011)

Lessons Learned from an HRO Journey

• Recruit a strong senior manager champion

• Take a systemic approach

• Derive an operational definition for YOUR operation

• Pinnacle Event

• Plateau Event

• Develop a lexicon

• Baseline your organizational culture

Tolk (2013)

Lessons Learned from an HRO Journey (continued)

• Perform a gap analysis against the operational definition and safety culture attributes

• Make HRO transformation part of the strategic plan

• Manage the transformation like a project

• Create a toolkit and use it often

• Measure, trend, and feed back results

Tolk (2013)

Lessons Learned from a Safety Culture Survey

• Understand the concept of safety culture

• Understand what culture means in the context of your business

• Know what you are going to do with the results

• Use an expert to design & administer the survey

• Designing an unbiased survey is difficult and time consuming

• Interpreting a survey is difficult

• Survey takers are more comfortable when responses are anonymous

• Be patient

• Designing, administering, and analyzing a survey takes time

• Change takes time

• Be strong

• Don’t ask a question if you can’t stand the answer

B&W Pantex (2011)

Who Can We Benchmark?

• Most practitioners are in health care

• Little (if any) research connecting HRO operations with tangible improvement in safety, productivity, and security

• Many companies report better performance overall, but cause and effect is elusive

• Difficult to measure – hard to prove – but worth it!


References

• B&W Pantex (2011). Presentation to Statoil.

• Deming, W. E. (1994). The New Economics for Industry, Government, Education, 2nd Edition. The MIT Press.

• Federal Aviation Administration (FAA). (2008). The Safety Culture Indicator Scale Measurement System (SCISMS) (DTFA 01-G-015). Atlantic City International Airport, NJ: von Thaden, T. L. & Gibbons, A. M.

• International Atomic Energy Agency (IAEA). (2002). Safety culture in nuclear installations: Guidance for use in the enhancement of safety culture (IAEA TECDOC-1329). Vienna, Austria.

• Perrow, C. (1999). Normal Accidents: Living with High-Risk Technologies. Princeton, NJ: Princeton University Press.

• Roberts, K. H. (1989). New challenges in organizational research: high reliability organizations. Organization & Environment, 3(2), 111-125.

• Roberts, K. H., & Rousseau, D. M. (1989). Research in nearly failure free, high reliability organizations: having the bubble. IEEE Transactions on Engineering Management, 36(2), 132-139.

• Rochlin, G.I. (1996). Reliable organizations: Present research and future directions. Journal of Contingencies and Crisis Management, 2(4), 55-59.

• Sagan, S. D. (1993). The Limits of Safety. Princeton, NJ: Princeton University Press.

• Schein, E. H. (2010). Organizational Culture and Leadership, Fourth Edition. San Francisco, CA: Jossey-Bass.

• Tolk, J. N. (2013). HRO 101: Introduction to High Reliability. Presented at HRO International Conference, Midland, MI.

• Tolk, J. N. & Hartley, R. S. (2011). The Journey to Become a High Reliability Organization. Presented at American Society of Engineering Management International Conference.

• U.S. Department of Energy. (2008). High Reliability Operations: A Practical Guide to Avoid the System Accident. Washington, DC: Hartley, R. S., Tolk, J. N. & Swaim, D. J.

• Weick, K. E. (1989). Mental models of high reliability systems. Industrial Crisis Quarterly, 3, 127-142.

• Weick, K. E. & Sutcliffe, K. M. (2001). Managing the Unexpected: Assuring High Performance in an Age of Complexity. San Francisco, CA: Jossey-Bass, A Wiley Company

• Weigmann, D. A., von Thaden, T. L., & Gibbons, A. M. (2007). A review of safety culture theory and its potential application to traffic safety. Institute of Aviation Human Factors Division, University of Illinois at Urbana-Champaign.
