Astropolitics: The International Journal of Space Politics & Policy

ISSN: 1477-7622 (Print) 1557-2943 (Online) Journal homepage: http://www.tandfonline.com/loi/fast20

Evaluating Decision-Making Modalities and Risk Acceptance Behavior after a Major Mishap: The Case of NASA and the Space Shuttle Columbia Accident

David M. Lengyel, Thomas A. Mazzuchi & Duane W. Deal

To cite this article: David M. Lengyel, Thomas A. Mazzuchi & Duane W. Deal (2017) Evaluating Decision-Making Modalities and Risk Acceptance Behavior after a Major Mishap: The Case of NASA and the Space Shuttle Columbia Accident, Astropolitics, 15:2, 113-140, DOI: 10.1080/14777622.2017.1340070

To link to this article: http://dx.doi.org/10.1080/14777622.2017.1340070

Published online: 27 Jul 2017.



RESEARCH ARTICLE

Evaluating Decision-Making Modalities and Risk Acceptance Behavior after a Major Mishap: The Case of NASA and the Space Shuttle Columbia Accident

David M. Lengyel (a), Thomas A. Mazzuchi (a), and Duane W. Deal (b)

(a) The George Washington University, Washington, DC, USA; (b) Extra Insights, LLC, Colorado Springs, Colorado, USA

ABSTRACT

The February 2003 loss of the space shuttle Columbia on mission STS-107 was a mishap that stunned both NASA and the world. This research examines the pre-NASA and post-NASA decision-making modalities and risk acceptance behavior for the safe and reliable operation of the space shuttle through the lens of hazard analysis. Interviews with NASA administrators and senior Space Shuttle Program managers bring back to life their views from the 2003 through 2005 timeframe, during which NASA returned the space shuttle back to a flight status. Lessons from their effort have broad applicability to other organizations recovering from—and attempting to prevent—a major accident.

The space shuttle Columbia, on Space Transportation System (STS) mission STS-107, was lost during reentry on Saturday morning, 1 February 2003, killing six National Aeronautics and Space Administration (NASA) astronauts and one Israeli astronaut. The ultimate cause of the accident was the shedding of a 1.3-pound piece of foam insulation from the shuttle external tank bi-pod area 82 seconds into ascent; the collision between this piece of foam and the shuttle orbiter caused a breach in a left wing panel. During reentry, hot gases entered this breach and destroyed the left wing, followed by the breakup of the vehicle and loss of the crew. On 26 August 2003, the Columbia Accident Investigation Board (CAIB) delivered its summary report to the president, Congress, and NASA. The report contained 29 findings and recommendations; 15 of these were deemed critical enough that the CAIB suggested that the space shuttle not be returned to flight until they had been resolved.

CONTACT David M. Lengyel [email protected] The George Washington University, 2121 I St. NW, Washington, DC 20052, USA.
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/fast.
The analysis and opinions expressed in this article are those of the authors and not of NASA or the U.S. government.

ASTROPOLITICS 2017, VOL. 15, NO. 2, 113–140 https://doi.org/10.1080/14777622.2017.1340070

© 2017 Taylor & Francis

Chapter 6 of the CAIB report is dedicated, in its entirety, to “Decision Making at NASA” and follows the historical trail of decisions leading up to the decision to launch Columbia as well as the Mission Management Team (MMT) decisions and missed opportunities to address the foam debris damage during the flight. Regarding risk acceptance, the CAIB noted the following:

When a program agrees to spend less money or accelerate a schedule beyond what the engineers and program managers think is reasonable, a small amount of overall risk is added. These little pieces of risk add up until managers are no longer aware of the total program risk, and are, in fact, gambling. Little by little, NASA was accepting more and more risk in order to stay on schedule.1

Engineering management issues exposed by the CAIB included the fact that NASA accepted more residual risk over time by not treating foam-shedding events as a safety-of-flight threat. A member of NASA’s Aerospace Safety Advisory Panel (ASAP) commented to the NASA administrator after attending the STS-107 Flight Readiness Review (FRR) that, in his opinion, the program was operating with a false sense of security and complacency about risk. He went on to opine that schedule and cost were compromising decisions, and that the program was not appreciating and evaluating major hazards as it should.

This salient observation, made before the launch of Columbia, was confirmed by the CAIB during its investigation. Residual risk acceptance decisions, particularly in human spaceflight, are never a trivial matter, but cost and schedule constraints can creep into and play a role in these decisions. During the interviews, conducted on a non-attribution basis, a former senior Space Shuttle Program employee explained that it is difficult to assess risk in very complex space systems; there are many unknowns, requiring experience as a guide. Furthermore, there is never enough funding and time to perform every analysis that the program may need. There is a point in time when you decide whether you have done enough. Columbia was an example of how management decision making and residual risk acceptance were less than adequate to ensure safe and reliable operations for the space shuttle.

Major mishaps, such as the Columbia tragedy, provide an opportunity for an organization to reflect on, derive lessons from, and learn from failure. The focus of this article is on the two-part question: how does a major mishap serve to revise management decision-making modalities and residual risk acceptance behavior; and how can a hazard analysis approach be used to guide organizational change? The events of the Columbia accident and NASA’s return-to-flight activities provide insight into answers on these questions. The perspectives of former senior managers from NASA, as well as NASA’s Aerospace Safety Advisory Panel, served as the building blocks for this research. This article integrates the appropriate literature to include hazard analysis and decision making under risk and uncertainty. The research approach and methodology for analysis are explained, and research results and conclusions are addressed.


Integrating the literature

The CAIB documented organizational deficiencies at NASA, specifically indicting decision-making modalities and residual risk acceptance. After several interviews were conducted for this article’s research, a set of distinct patterns began to emerge between the CAIB findings and the body of knowledge found in hazard analysis and decision making under risk literature. A structural model of these relationships then guided the construction of a methodology to examine post-mishap management actions, which drove changes in decision-making modalities and residual risk acceptance behavior.

Hazard analysis

Hazard analysis is simply an organized methodology for identifying, classifying, and developing controls for hazards associated with a system to manage the risk. It is a deductive process used to analyze nearly every aerospace system and sub-system. The reason that it is proposed here is that NASA used this hazard analysis methodology in its flight rationale summary presentation to the Return-to-Flight Task Group before STS-114.2 The presentation covered the technical/engineering topics related to elimination of critical debris, impact detection during ascent, on-orbit debris impact/damage detection, on-orbit thermal protection system repair, and crew rescue. Not handled in this safety case were hazards associated with the management system, which played a critical role in the Columbia mishap in the form of less-than-adequate decision making and residual risk acceptance. The intent here is to use this hazard analysis approach for that domain of CAIB recommendations.

The following Johnson Space Center (JSC) Health and Safety Handbook excerpt provides guidance for the process of hazard analysis.

You shall use these steps to decide what corrective action to take for any hazard found during your analysis. Take the following actions in the order below to control a hazard. Go to the next step only if the present step or previous steps aren’t feasible or are too costly:

● Change the design to eliminate or reduce the hazard
● Install safety devices or guards
● Install caution and warning devices
● Use administrative controls
● Use personal protective equipment
● Accept the risk

Make sure that all hazards are controlled. To do this, you shall track each hazard and keep it “open” until one of the above actions has occurred.3
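The order of precedence in the handbook excerpt can be sketched in a few lines of Python. This is only an illustration: the six action names are taken from the excerpt, while the function names and the idea of passing in a set of "feasible" actions are assumptions introduced here, not part of the JSC handbook or of NASA practice.

```python
from typing import Iterable, Optional

# Actions in the order given by the JSC handbook excerpt quoted above.
CONTROL_PRECEDENCE = (
    "Change the design to eliminate or reduce the hazard",
    "Install safety devices or guards",
    "Install caution and warning devices",
    "Use administrative controls",
    "Use personal protective equipment",
    "Accept the risk",
)


def select_control(feasible: Iterable[str]) -> Optional[str]:
    """Return the highest-precedence action that is feasible.

    `feasible` is a hypothetical input: the actions an analyst has judged
    feasible and not too costly for one hazard. Returns None if no action
    has been taken yet.
    """
    feasible_set = set(feasible)
    for action in CONTROL_PRECEDENCE:
        if action in feasible_set:
            return action
    return None


def hazard_is_open(chosen: Optional[str]) -> bool:
    """A hazard stays "open" until one of the listed actions has occurred."""
    return chosen is None


if __name__ == "__main__":
    chosen = select_control({"Accept the risk", "Use administrative controls"})
    print(chosen)                  # Use administrative controls
    print(hazard_is_open(chosen))  # False
```

The sketch makes one property of the excerpt explicit: accepting the risk is the last resort, chosen only when every earlier action has been ruled out.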

Hazards are classified and prioritized using a 4×5 matrix and associated qualitative scoring scheme. Conditions are classified as: Class I—Catastrophic; Class II—Critical; Class III—Moderate; and Class IV—Negligible. The likelihood scoring scheme is also qualitative with five levels: (A) Likely to occur; (B) Probably will occur; (C) May occur; (D) Unlikely to occur; and (E) Improbable.4 Table 1 delineates the risk assessment codes (RAC) for each class of hazards.
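As a minimal sketch, the lookup in Table 1 can be written as a nested dictionary keyed by consequence class and likelihood level. The matrix values and the three acceptability bands are copied from Table 1 and its notes; the function names `rac` and `acceptability` are hypothetical and chosen only for this illustration.

```python
# Risk assessment codes from Table 1: consequence class -> likelihood -> RAC.
RAC_MATRIX = {
    "I":   {"A": 1, "B": 1, "C": 2, "D": 3, "E": 4},
    "II":  {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5},
    "III": {"A": 2, "B": 3, "C": 4, "D": 5, "E": 6},
    "IV":  {"A": 3, "B": 4, "C": 5, "D": 6, "E": 7},
}


def rac(consequence_class: str, likelihood: str) -> int:
    """Look up the RAC for a consequence class (I-IV) and likelihood (A-E)."""
    return RAC_MATRIX[consequence_class][likelihood]


def acceptability(code: int) -> str:
    """Map a RAC to the top-level disposition given in the notes to Table 1."""
    if code == 1:
        return "unacceptable"
    if code == 2:
        return "undesirable"
    if 3 <= code <= 7:
        return "acceptable with controls"
    raise ValueError(f"RAC must be between 1 and 7, got {code}")


if __name__ == "__main__":
    # A catastrophic (Class I) hazard that "may occur" (C) scores RAC 2.
    code = rac("I", "C")
    print(code, acceptability(code))  # 2 undesirable
```

Under this scheme, a Class I hazard is acceptable with controls only at likelihood D or E, which is why the likelihood estimate assigned to a catastrophic hazard carries so much weight.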

An assumption is that management systems, including decision-making modalities, if not designed, vetted, and validated, may be the source of Class I hazards with varying degrees of RAC scoring. Even adequate management systems can fail, due to inherent design flaws. One need only note the Challenger space shuttle launch decision,6 or the series of decisions leading up to the British Petroleum Deepwater Horizon accident, Exxon Valdez oil spill, Fukushima nuclear reactor, and 2008 mortgage derivatives crisis to see the aftermath of poor risk decisions.7

Decision making under risk

Modern theories of decision making under risk have been the subject of extensive research over the years, with Von Neumann and Morgenstern (1944), Kahneman and Tversky (1979), and Keeney and Raiffa (1993)8 as thought leaders, to name but a few. Rather than immerse in the specifics of their work, much of which is based on static lab-based games or lotteries, it is more useful to examine this literature through the lens of modern-day applications to engineering and safety decisions.

Table 1. Hazard Analysis (Risk Assessment Code) Matrix.

                     Likelihood Estimate
Consequence Class    A    B    C    D    E
I                    1    1    2    3    4
II                   1    2    3    4    5
III                  2    3    4    5    6
IV                   3    4    5    6    7

Notes: The codes, 1–7, indicate the risk level and the top-level actions which must be implemented for that hazard. RAC level 1 is unacceptable, RAC 2 is undesirable, and RACs 3–7 are acceptable with controls.5

Prominent in the engineering and operations domains of decision making are a common set of decision attributes (cost, schedule, technical, and safety), which are tied to program requirements and the uncertainties associated with them. These attributes are also aligned with continuous risk management scoring practices, which require decision makers to assign likelihoods (probabilities) and consequences to each risk, should it occur. The linkage between risk and decision analysis is described as a companionship.9 The decisions in question related to the STS-107 mishap were primarily, but not limited to: STS-107 launch decision without resolution of the bi-pod foam incident two missions earlier; decision not to request on-orbit imagery to assess orbiter Columbia damage; decision(s) to ignore lower-level engineers’ concerns to obtain imagery for damage assessment purposes; and lack of a decision to inspect the leading edge of the wing by a space walk to assess orbiter damage. Clearly obvious here is a break between decision making and adequate risk analysis. Without the latter, a false sense of risk acceptance rationale (the foam striking the orbiter considered as only a maintenance turnaround issue) developed.

NASA’s Management Oversight and Risk Tree (MORT) was designed in the early 1970s as an analytical approach to examine mishap causes and evaluate any safety program. MORT acknowledges that program managers must assume certain risks that are “attendant to the design, manufacture, test, and operation of the hardware system to effectively accomplish the mission for which the system was developed.” This acceptance, however, requires full visibility into the nature of hazards and risks that are in existence and the options and alternatives to the acceptance of the risks.10

In this instance, NASA managers clearly did not understand the nature of the hazards and risks related to reentry of an orbiter with a damaged wing.

Research approach and methodology

Overview

For this research, the dimensions of comparison are three distinct and hierarchically arranged decision-making levels, allowing a multi-case study approach. These levels are: (1) the Space Operations Mission Directorate (SOMD) at NASA headquarters; (2) Space Shuttle Program at NASA’s Johnson Space Center; and (3) MMT—a multi-center forum for real-time operations support. Examining these entities allows a comparative analysis before and after the mishap and across multiple levels of management decision making.

Data collection

A key goal of data collection is to discover a set of core variables, which account for most of the variation in how a mishap serves as a forcing function for changes in decision-making modalities and the role of risk attitudes. Data collection for this case study used structured interviews (see the following questions), combined with an extensive collection of secondary data.11 Data collection was integrated with analysis and theoretical sampling in a series of iterations to develop a set of hazards and controls, which attend to the research question. The timeframe of the research setting is 2002 through 2005.


● What do you consider to be the major changes in management decision-making approaches/methodologies before and after the Columbia accident?

● Was there overall agreement on return-to-flight integrated programmatic and technical goals and objectives and metrics to measure progress to make risk-informed decisions by management?

● What was your perception of the weighting of key decision criteria (e.g., cost, schedule, technical, safety, etc.) used to make closure or acceptance decisions on return-to-flight risks? How did these risks shape the decision-making processes for return-to-flight?

● Using a scale from 1 to 10, with 1 being risk averse, 5 being risk neutral, and 10 being risk seeking, what was your perception of management’s risk appetite before Columbia and during return-to-flight leading up to STS-114?

● Were you confident those decisions to close or accept risks were well characterized and acceptable before STS-114 by management? Was this communicated effectively? (Internally/externally)

● What decision analysis tools (probabilistic risk analysis (PRA), integrated hazard analysis, fault trees, fishbone diagrams, decision trees) were used to facilitate overall decision understanding by management?

● What were the strengths and weaknesses of the decision-making approaches used by management? If you were “King for the Day,” how would you have improved the process?

● Did the Federal Advisory Committee known as the Stafford-Covey Return-to-Flight Task Group influence decision making by senior management in any way? Short term? Long term?

● What were your top-three take-aways in the area of group decision making before and after the Columbia mishap?

Data

The difficulty in capturing primary data in the form of interviews cannot be overstated—this was a primary factor in the selection of NASA’s Columbia accident as a critical case. Some individuals are extremely sensitive, if not reticent, about providing an interview that deals with organizational shortcomings after a major accident that involves loss of life. Reasons for selecting the Columbia mishap as the research setting included: access to key decision makers at NASA; the mishap allowed examination and the extension of hazard analysis and decision making under risk theories; and the accident resulted in part from less-than-adequate decision making before and during the mission.


Structured, non-attributable, telephonic interviews of senior NASA managers and independent oversight leadership (Aerospace Safety Advisory Panel and Return-to-Flight Task Group) served as primary data for this study.12 Interviews were recorded and transcribed. Transcripts were coded to operationalize variables and relate hazard analysis concepts. A purposive approach was necessary in the selection of interviewees—senior leadership in the Space Shuttle Program (SSP) and at NASA headquarters was essential for capturing this information. An element of snowball sampling was necessary to pick up additional inputs and fill gaps suggested by interviewees or determined from analysis gaps.13

Secondary data were used to provide context and justify assumptions, and included, for example, the CAIB report, the NASA Return-to-Flight Implementation Plan, the Return-to-Flight Task Group report, NASA procedural requirements, well-known safety literature and peer-reviewed articles.

Dependent variables

The dependent variables are the post-mishap changes, described qualitatively, in decision modalities and risk acceptance behavior at three management levels: (1) Strategic; (2) Program; and (3) Operational (risk acceptance behavior). The dependent variables are the outputs from the hazard analysis process. Risk acceptance, or risk appetite, as it is often referred to, is defined as an organizational trait that can be characterized by being risk averse (conservative), on one end of the spectrum, to willing to accept risk, on the other.14

Independent variables

The independent variables are the qualitative factors that impact change in decision-making modalities and residual risk acceptance behavior. These variables are the individual, group, and organizational factors that cause change in the organization post-mishap. A small set of key variables was drawn from this list to explain the largest variations in the dependent variables. In the hazard reduction framework, we considered these variables as “conditioning” or “initiating” events that needed to occur before the framework could be successfully applied. When translated into the hazard reduction framework, these variables were determined to be: whether it is a major mishap versus a close call event; whether the mishap is an existential threat to the program that may cause program cancellation; whether independent mishap board report findings and recommendations find that decision making is a proximate cause of the mishap; whether the mishap is a significant emotional event for those involved; the extent to which senior leadership is committed to making changes; and the extent to which the organization internalizes the findings of the mishap report.


Limitations

Given that memories of individuals are fallible, retrospective analysis from 2003 (12 to 14 years after the period being studied) is subject to several potential limitations in the primary data stemming from memory biases, such as confirmation bias, hindsight bias, and change bias.15 Having said that, interview data were carefully coded, looking for inconsistencies, and checked for accuracy against secondary data.

Results

Data and methods

A substantive coding list was matured over the course of the interviews. A systematic and iterative process of interviewing, coding, and memo writing was established to develop a hazard reduction framework. These changes were then compared at the three levels of management.16 After 15 interviews were conducted, there was a pattern of repetition in factors related to changes in decision making and residual risk acceptance. At that point, there was a sense that theoretical saturation was reached, and that further interviews would not yield any additional information.
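One way to make the notion of theoretical saturation concrete is to count how many previously unseen codes each successive interview contributes. The sketch below is an illustration only: the article describes a qualitative "sense" of saturation, not a numerical rule, so the stopping criterion (a run of `window` interviews with no new codes), the function names, and the sample data are all assumptions.

```python
from typing import Iterable, List, Optional, Set


def new_codes_per_interview(coded: Iterable[Set[str]]) -> List[int]:
    """Count, for each interview in order, the codes not seen in earlier ones."""
    seen: Set[str] = set()
    counts: List[int] = []
    for codes in coded:
        fresh = set(codes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def saturation_point(coded: Iterable[Set[str]], window: int = 3) -> Optional[int]:
    """Return the 1-based interview number at which `window` consecutive
    interviews have added no new codes, or None if that never happens.

    The `window` threshold is a hypothetical rule chosen for illustration.
    """
    run = 0
    for number, fresh in enumerate(new_codes_per_interview(coded), start=1):
        run = run + 1 if fresh == 0 else 0
        if run >= window:
            return number
    return None


if __name__ == "__main__":
    # Invented codes for five interviews; not data from the study.
    interviews = [
        {"schedule pressure", "stove piping"},
        {"stove piping", "technical authority"},
        {"schedule pressure"},
        {"technical authority"},
        {"stove piping"},
    ]
    print(new_codes_per_interview(interviews))     # [2, 1, 0, 0, 0]
    print(saturation_point(interviews, window=3))  # 5
```

A count like this can support, but not replace, the analyst's judgment that further interviews would add little.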

Comparisons of interview results

Some of the most compelling statements during the interview process are provided in the following and are organized by the levels of management, beginning with NASA headquarters Space Operations Mission Directorate (SOMD), followed by the Shuttle Program Requirements Control Board (SPRCB) and, finally, the MMT. Space Shuttle managers at NASA’s Johnson Space Center led the latter two forums, with participation from primarily NASA headquarters, the NASA Marshall Space Flight Center (MSFC), the NASA Kennedy Space Center, as well as NASA contractors. They are further divided into the periods before the mishap and after the mishap to examine changes. The excerpted comments provide insight into the unedited thoughts of the interviewees. Exact comments are bounded by quotation marks, though the interviewees were promised anonymity and citations are therefore not provided.

Space Operations Mission Directorate: Before mishap

SOMD was not as engaged in decision making on a day-to-day basis before the accident. Its members kept up to date (with the program) by tracking issues, but the program was pretty much on its own to make decisions autonomously. Broadly speaking, decisions regarding the shuttle system were made within the shuttle program and they were or were not, depending on the case, communicated upward to more senior management. As one manager explained, “Writ large, there was not enough technical prowess at (NASA) HQ [headquarters] to take on sometimes arrogant field organizations that say we know about physics and you don’t.”

One interviewee commented, “My view of the ‘before’ was lots of motion, lots of action, but very little movement toward safe and reliable operations. Technical wasn’t king, schedule was the dominant factor, and programmatic cost and schedule prevailed.” Perhaps exacerbating these issues was the fact that, unless something was identified as a really big risk, which had to go to the NASA administrator for adjudication, NASA tended to accept risk by committee, rather than responsible individual.

A senior NASA manager noted, “By the time of the Columbia mishap, it had been 17 years since the Challenger accident, and several generations of managers had come and gone. There had not been an overtly threatening incident, meaning one that people recognized as threatening. There were some threatening incidents that the CAIB identified in retrospect, but people didn’t notice them at the time. People’s view was, I believe, that we generally understand the shuttle, we know how to fly it, and the shuttle was operational in their view.”

Space Operations Mission Directorate: After mishap

After Columbia, critical program decisions were no longer taken in isolation by the program. These types of decisions went to the highest levels. A senior administrator remarked, “I insisted that I was going to be a part of any significant decision on shuttle, because we had demonstrated that if something went wrong on shuttle, it wasn’t just the shuttle program manager who was going to be in trouble, it was the entire agency. If you are going to risk the agency, then agency-level management is going to be involved in those decisions.”

Per the CAIB recommendation on independent technical authority (ITA), NASA implemented this concept agency-wide after the Columbia accident. ITA was intended to modify decision making and give technical inputs to decisions the same weight as, if not more weight than, cost and schedule decision attributes. As one manager explained, “The technical authority kept evaluating the status of technical knowns and unknowns and when we finally got to a point where a calculation on the risk was possible and the overall risk was deemed acceptable, and knowing how many shuttle flights were left, we accepted the risk and finished out the Space Shuttle Program. But we flew with many unknown technical issues. If this had been a new program, it would not have been accepted as safe.” Another manager noted that, “I think that after Columbia, while you wouldn’t want to say that cost was no object, because money was not unlimited, but on a practical basis the cost of getting the shuttle program fixed was a lower priority than doing the right thing, if that could be determined. It can be hard to know what the right thing to do is, but if we could figure it out, we did it, and the cost was what it was. Schedule took a distant back seat.”

NASA also tried to temper the pace of decision making and schedule pressures with the idea that they were not going to make less-than-adequate decisions for schedule only. But, at the same time, there remained short-term and longer-term schedule pressures to finish the space station. A senior official commented that, before he took over, “President Bush and his advisors decided that the shuttle would be retired by 2010. I’ll say 2010 or so, because while it might have been politically a very specific date, technically it wasn’t—the shuttle didn’t fall apart on January 1st of 2011. Ultimately, we did retire it in 2011. The point that I am trying to make is that I had that date out in front of me, and yet we also had the task assigned by the president, and the Congress as well, and which I strongly supported, that we would finish the space station.”

In the aftermath of the CAIB report submission in August 2003, then-NASA Administrator Sean O’Keefe chartered an executive team led by NASA’s Goddard Space Flight Center Director Alphonso Diaz. The team was asked to dissect the CAIB findings and recommendations and study their applicability to the entire agency.17 Changes were categorized as follows: leadership, learning, communication, processes and rules, technical capabilities, organizational structure, and risk management.

In the domain of decision making, the executive team emphasized knowledge-based approaches, facilitating effective communication, increasing the level of technical expertise brought to decisions, operating within the organization’s rules, feedback systems in channels of decision making such that the workforce is aware of the status and rationale of decisions, strengthening the safety and mission assurance organization to effectively participate in decisions, and implementation of independent technical authority which employs systematic checks and balances.18

Further, the team called for the following in the domain of risk management: integrating decision-making and risk management processes, establishing independent risk assessments, taking a systematic view of risk, implementing uniform risk management standards and methodologies across all programs, and benchmarking of other organizations for valuable lessons learned.19 In the years following the Columbia accident, the Diaz report and its implementation were of high priority across the agency. Significant changes were made in the Space Shuttle Program, which then successfully flew 22 more missions in a safe and reliable manner before the remaining fleet of three Orbiters was retired in 2011.


Program Requirements Control Board: Before mishap

The Shuttle PRCB managed both engineering and mission requirements at the program level and was chaired by the SSP program manager. Interviewees offered their views on decision making and risk acceptance behavior in this forum. One manager noted that, “Before the accident, there was a consistent belief that drove everything, that America must continue flying, and NASA must continue to fly. The objective was to get the space station completed, which would build America’s confidence in NASA to enable future programs.” Another NASA official bluntly stated that, “There was flawed decision making and there needed to be changes made to improve decision making.” Commenting on risks, another manager noted that, “Everybody in the program knew there were risks, and we had to be careful, and we had to be as safe as we could, but cost and schedule were paramount.” This comment was further explained by a former NASA employee: “We didn’t have a sufficiently balancing force to give consideration to risk acceptance and to make sure that other voices were heard.”

One former program official noted bluntly: “Before the accident, I would have to say that the decision process was largely about making sure that everyone was protected. That for every issue we found somebody to say it was OK. The program manager could say, for this issue this person said it was OK, for this issue this person said it was OK, and so on. Here was a legalized process way of saying it’s OK. It was all about building evidence that said what we are doing makes sense—what we are doing is defensible. That was the major mode of operation of the Shuttle Program beforehand. Is what we are doing defensible? Not is it right, not is it safe—is it defensible? So we have a set of evidence that says it’s defensible.”

Less than adequate communication tends to resurface as an issue in themajority of mishap reports. Columbia was no exception, with one formermanager stating that, “There wasn’t enough knowledge sharing. Orbitermade Orbiter decisions, integration made integration decisions, and MSFCmade MSFC decisions. The biggest issue was the stove piping so people can’tsee back and forth across the program, and the sharing of knowledgebetween the projects was poor.”

Program Requirements Control Board: After mishap

The implementation of the ITA management structure and attendant processes was perhaps the biggest instrument for change in decision making and risk acceptance. ITA provided a structure that employed checks and balances between program and technical (engineering, safety, and medical) authorities and was designed to ensure that decisions had the benefit of those different points of view and that they were not made in isolation. It also provided a means for handling dissenting opinions. As a former ASAP member pointed out, “When the program manager held tremendous power and authority and the money, there was little check and balance.” A former senior SSP manager summed it up by stating, “Program managers are responsible for achieving objectives within cost and schedule, and technical management sometimes falls victim to those pressures. The functional or engineering side of the organization is not responsible for cost and schedule, and that, too, can be a danger for other reasons. They tend to elevate the technical concerns more highly. When both sides have equal organizational authority, the proper balance of decision making is more easily obtained.”

As with any organizational change effort, the implementation process, although necessary, was difficult and strained the system. Another NASA official noted that, “In the days before Columbia, a program manager could get away with overriding safety if he did certain things. After Columbia, it was made very clear to us if one of those independent authorities objected, we were going nowhere until we got their concurrence by changing what we were doing, or we had to appeal all the way to the administrator.” Another manager noted, “Organizational change is extraordinarily difficult. It is made a lot easier when you have a significant emotional event and the senior leadership recognizes that fundamental change has to be made. That makes it easier but it is not easy. It is still very hard for people to change.” Part of the change that was obvious to observers of NASA was the replacement of the senior managers on the shuttle program. As one NASA manager recalled, “We had a clean sweep—everybody at the top level was gone. I think that was intentional and maybe necessary.”

How did ITA impact decision making? One HQ official recalled that “I put in place a system where in order to make a really bad decision, you had to have a double fault. You had to have a failure on the engineering, or functional side of the chain of command, and you had to have a failure on the programmatic side of the chain of command, before a really bad decision would get made.” A former program-level manager concurred with this assessment, stating, “We set this system up that said OK, we want the program manager to have these independent technical authorities that will keep the program manager from making a stupid decision. We want to stop them taking on too much risk.” A NASA astronaut stated, “The program manager should not take on uncontrolled residual risk unless engineering says it meets technical requirements, safety says it’s OK, and the risk taker (the astronaut) agrees to take the risk. That concept was always there, but it was never formally documented until only a few years ago.”

Responding to the subject of cultural changes at the program level, one manager recalled that, “The ‘getting it right culture’ and organizational change, proving to the up and out folks that we had addressed all these issues, was vital. You won’t find this written down, but it did change the decision-making processes. There was no more discussion of saving nickels and dimes around the program by assuming the system was mature and we understood all the risks.” The cultural changes were not without some turmoil, however, and another manager opined that, “I had never in my time at NASA seen as much attention paid to people that could not join the consensus. We gave them time and listened to them and elevated their concerns to the next levels and so on. The process slowed us down in some ways, and I’m sure it frustrated the program manager. It made the program look less decisive, took more time, energy, and sometimes money and schedule. In the end, it made for quite a good informed decision process.”

Speaking to the system safety process changes, a former official noted that, “After Columbia we became a lot more religious about hazard reports—integrated hazard analysis. We became a lot more serious about quantitative methods. We became a lot more serious about using the likelihood/consequence matrix to express risks.” He went on to comment on risk management, stating that “there were some rocky parts to the start (of doing formal risk management) but the understanding evolved over the two-year period—by the time we got to STS-114 in July 2005, it was a really good effort. We understood much more quantitatively where the risks lie, and it was very easy to communicate.” He ended our interview with the following cautionary comment: “Were the risks characterized? Yes. Were they acceptable? No, but there was no other option. Were they acceptable on an absolute standard? No. Were they acceptable on a ‘We’re in a flight test program, and we are going to continue to be in flight test program, and we are going to have to continue to fly and learn?’ Yes.”

In the final calculus of managing risky technology such as the space shuttle, a former systems engineer noted that, “There are limits to what management science can do. There are limits to what good management can do. Now, could we have used better quantitative techniques? Yes, we should have and we did. Should we have used more tests? Should we have been more willing to spend more money for tests? Yes, we should have. Should we have been more inclusive? Yes, we did and we were. Did we fundamentally change the calculus? No. It was the same calculus and the same things motivated people. The fundamental issue was that we decided to continue to fly a system that was not robust and could not be made robust. All of the labor that we did only was about dealing with the third decimal point. We didn’t really make the system tremendously safer. All we were doing was tweaking about the edges of risks that we could not mitigate.”

Mission Management Team (Operational): Before mishap

The MMT was a real-time operations support forum consisting of a multidiscipline team of managers representing engineering, integration, safety, flight crew, mission operations, and space and life sciences. Per the MMT guiding document, NSTS 07700, Volume VIII, the “MMT resolves outstanding problems outside the responsibility or authority of the Launch and Flight Directors.” One former shuttle manager commented that, “Before the accident, decision making was autocratic and more guided by policy than data. Second, engineering issues did not receive the attention they should have. Third, the stove piping issue, which combined with mid-level (division-level) management knowledge base and knowing how to make informed questions of their people.”

Adding to these inadequacies, the MMT also suffered from a lack of discipline in terms of designated attendee participation and lack of training for attendees. One manager noted that, “Even though the MMT was a recognized body and was written up in some of the documents, including a list of positions that were to be represented in the MMT, there was no formal training; people would just get assigned. If the director of Space and Life Sciences is supposed to be at the MMT and can’t make it, then he can just assign whomever he wanted to in his organization to come. They may not know what an MMT is, they may not know anything about human space flight, but they are going to represent that organization.”

While the CAIB noted cultural challenges at NASA that contributed to the Columbia mishap, it took internal soul searching, nudged by external advisory committees, to put NASA on a path to address these issues. A former program manager commented, “We had a number of senior managers that had been brought up through the NASA human space flight system that had very hierarchical points of view, and we knew that we needed to change that culture. We had to change it, in fact, so that people were more open to dissenting opinions, would allow weak signals, as they were called, to be heard and evaluated properly, which I think we were notorious for squashing weak signals in many instances as an organization before. At the same time we had to build, in the Mission Management Team setting, this new culture that demonstrated to our outside reviewers that we were making the changes.”

Mission Management Team (Operational): After mishap

In the aftermath of Columbia, a senior official summed up the initial framework for fixing MMT issues with the statement, “We literally brainstormed everything that we could think of that would improve the decision making and retrain the managers. We specifically trained our leadership to try to listen for these weak signals, and to understand that just because someone is not presenting very coherently or articulately doesn’t mean that there isn’t a concern there that we need to address.” This brainstorming resulted in instruction on the basics of how to conduct a meeting, listening skills, training, simulation, and measures to address cultural issues.

In terms of changes to group decision-making modalities and decision forum conduct, the MMT received more attention than SOMD’s space flight leadership forum or the shuttle program PRCB. A manager involved in this transformation stated, “One of the things that we did early on was we hired various consultants and we started looking for classes that would help us. We put all of our senior managers through what used to be called a ‘Cockpit Resource Management’ (CRM), which is training that NASA developed with the FAA [Federal Aviation Administration] for airline pilots and aircrews. In the 1970s or 1980s, there were a number of instances where aircrews were distracted and flew their airplanes into the ground, or had other accidents that were clearly avoidable, but they were not properly coordinated in the cockpit and keeping their eye on critical functions. That evolved into a course, which I think airline safety has greatly improved from, which was modified then for a number of different disciplines, but certainly applied to our business. How do you prevent an error chain from occurring? How to not be distracted at critical times? So we put all of our managers through that. There were a number of criticisms on presentations and the way that it was presented. We had Dr. Tufte come and do his course on presentation materials; it did point out that some of the things we were doing were pretty horrific in the way that presentations were put together. It encouraged people to do more ‘white-papers’ than PowerPoint presentations. We had a behavioral sciences organization come in and sit in our meetings and watch our behaviors and then give us feedback at the end of the meetings, critiques on interpersonal relationships and how we did. We all felt like we had gone back to grammar school where the teacher was, you know, you had to say ‘please and thank you’ and be polite. Some of the old-line managers had a real problem with that attitude change because, coming from an almost military, hierarchical, we’re ‘all about the facts’ and ‘we don’t suffer fools gladly’ type of organization, it was difficult to sit and listen. We also did a formal certification process.”

Training and simulation were an important element in addressing the hazards associated with inadequate performance of the MMT. A senior NASA manager observed, “We developed a plan to train people where they had to actually read some material, they had to observe some MMTs doing what we call ‘on-the-job training,’ they had to take some classwork to be prepared, and then they had to be formally certified. So it wasn’t just that you had to have a representative for these organizations; you had to have a certified, trained representative from these organizations so that people understood what was going on. Part of the training was this culture change and that we’re going to speak up, we’re going to poll everybody. We’re not going to throw people out of the room. The MMT simulations were very effective; they made us think as a group, they educated the team on the entire operation and folks that were not typically involved in the mission execution phase of the flight—say people who worked on the main engines or the other propulsion elements—became very involved and actually, by virtue of being trained, smart, observant managers, provided really some key insights to the operations team from time-to-time on how to solve problems.”

One MMT manager explained, “We stressed the culture changes of listening to weak signals, having thorough discussions, taking the time that we needed to take to make a decision, and not make decisions just because the clock said that it was time for the meeting to be over. Our goal was to listen to everybody that had a dissenting opinion, ensure that we heard their dissenting opinion or their concern appropriately, and that we dispositioned it appropriately, and that they understood our disposition. That’s a reasonable goal.”

Comparing the data

During and after the fact-finding period, interview data were carefully compared and synthesized across the three management levels and the two dependent variables. Inputs from interviewees were validated when three or more persons answered substantially the same on a particular topic. A summary of the data is shown in Table 2.
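The validation rule described above can be illustrated with a minimal sketch. Only the threshold of three concurring persons comes from the text; the coded responses, the interviewee labels, and the function name `validated_findings` are hypothetical and are not drawn from the study's data.

```python
from collections import Counter

# Hypothetical coded responses: (interviewee, topic, coded answer).
# These rows are illustrative only and are NOT data from the study.
responses = [
    ("A", "ITA", "added checks and balances"),
    ("B", "ITA", "added checks and balances"),
    ("C", "ITA", "added checks and balances"),
    ("D", "ITA", "slowed decisions"),
    ("A", "training", "MMT simulations effective"),
    ("B", "training", "MMT simulations effective"),
]

def validated_findings(rows, threshold=3):
    """Return (topic, answer) pairs given by at least `threshold` distinct interviewees."""
    # De-duplicate so one interviewee repeating an answer is counted once.
    unique = {(who, topic, answer) for who, topic, answer in rows}
    counts = Counter((topic, answer) for _, topic, answer in unique)
    return sorted(pair for pair, n in counts.items() if n >= threshold)

findings = validated_findings(responses)
print(findings)
```

With the hypothetical rows above, only the ITA answer shared by three interviewees passes the threshold; the training answer, given by two, would be treated as unvalidated.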

In general, after the mishap, there was consistency in changes in decision-making modalities and residual risk acceptance behavior at all levels. ITA and balancing programmatic and technical/safety decision attributes were consistent across all three levels. As one interviewee explained, “The decision-making process became two-fault tolerant with the implementation of ITA; i.e., two failures would have to occur for a bad decision to be made.” Reduction of organizational stove pipes and increased knowledge sharing were unique to the program-level PRCB at level 2. The most significant changes were at level 3, the MMT, where a great deal of detail went into improving this forum’s decision-making capabilities through training and simulation, as well as codifying how the organization dealt with dissenting opinions.
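The interviewee's point that two failures must coincide can be expressed as simple arithmetic. The sketch below assumes, purely for illustration, that the programmatic and technical chains fail independently and with hypothetical probabilities; neither assumption comes from the interview data.

```python
def bad_decision_probability(p_program_fails, p_technical_fails):
    """Chance that both chains of command fail on the same decision.

    Assumes the two failures are statistically independent, which is an
    illustrative simplification rather than a finding of the study.
    """
    for p in (p_program_fails, p_technical_fails):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_program_fails * p_technical_fails

# Hypothetical values chosen only to show the effect of a second check.
p_single_chain = 0.10
p_dual_chain = bad_decision_probability(0.10, 0.10)
print(p_single_chain, round(p_dual_chain, 4))
```

Under these assumptions the second, independent authority reduces the chance of an unchecked bad decision by an order of magnitude. If both chains were subject to the same cost and schedule pressures, the failures would be correlated and the benefit would be smaller.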

Understandably, the risk acceptance level dropped after the mishap. During the interview sessions, subjects were asked to rate their subjective risk acceptance levels before and after the mishap on a scale of one to 10, with one being highly risk averse, five risk neutral, and 10 risk seeking. An empirical cumulative distribution function (CDF) from the responses was obtained and is shown in Figure 1. Note the risk acceptance shift to the left after the mishap. This subjective analysis provides sufficient data to discern a behavior pattern among NASA managers before and after the mishap, which was also expressed linguistically in the interview responses.
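For readers unfamiliar with the technique, an empirical CDF on the one-to-10 scale can be computed as in the following sketch. The ratings are hypothetical and are not the study's responses; the function name `ecdf` is likewise an assumption made for illustration.

```python
def ecdf(ratings, scale=range(1, 11)):
    """Empirical CDF: fraction of ratings at or below each point on the scale."""
    n = len(ratings)
    return {x: sum(r <= x for r in ratings) / n for x in scale}

# Hypothetical ratings (1 = highly risk averse, 10 = risk seeking).
# These values are illustrative only and are NOT the study's data.
before = [6, 7, 7, 8, 5, 6, 7, 8]
after = [3, 4, 2, 4, 5, 3, 4, 3]

cdf_before = ecdf(before)
cdf_after = ecdf(after)

# A shift to the left means the post-mishap curve reaches any given
# cumulative fraction at a lower rating, so it lies on or above the
# pre-mishap curve at every point of the scale.
shifted_left = all(cdf_after[x] >= cdf_before[x] for x in range(1, 11))
print(shifted_left)
```

In this illustration every post-mishap rating is at or below five, so the post-mishap curve reaches 1.0 at a rating of five while the pre-mishap curve has barely begun to rise.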

Before the Columbia mishap, management believed that they generally understood the risk levels, and that the space shuttle was safe to fly. In stark contrast to viewing the space shuttle as an experimental vehicle, at the time of the accident, NASA was conducting internal studies to determine if the space shuttle could be privatized, wherein the agency would buy “seats” into space as a service. In reality, the residual risk level had increased as a result of recurrent foam losses. The program had, in a sense, lost its vigilance and its compliance with technical requirements, and accepted higher risk as a result. Prior to Columbia, there was no formal risk management process at the program level where residual risk discussions and risk trade-off analysis would normally occur.

Emergent hazard analysis model

An interim model was developed early in this project to acknowledge changes in decision modalities and residual risk acceptance behavior. The hazard analysis risk reduction framework was selected as the engine of change due to its common acceptance as a risk identification, analysis, and mitigation tool at NASA, particularly during the shuttle’s return-to-flight. With this framework chosen, a second pass through the interview data was made to determine what behavioral and decision structure changes were deemed to be necessary to mitigate the identified hazards. These data, in some cases, overlap with those provided in Table 2, but provide a slightly different lens through which to view the issues. The outputs of this analysis are shown in Table 3.

Table 2. Qualitative Data Synthesis and Analysis.

Dependent variables, by management level: Decision Modality Changes and Residual Risk Acceptance Changes.

SOMD—Level 1

Decision Modality Changes
● Leadership involved more in technical decisions
● Independent Technical Authority policy developed
● Decision attribute weighting changes (cost/schedule vs. technical/safety)

Residual Risk Acceptance Changes
● Pushed back on external stakeholders on schedule issues
● Became more aware of known risks which were accepted
● Funding made available to shuttle program to obtain test data to reduce residual risks

SSP PRCB—Level 2b

Decision Modality Changes
● Senior management changed
● Removed autocratic approach to decision making
● Independent Technical Authority implemented
● Decision attribute weighting changes (cost/schedule vs. technical/safety)
● Reduced stove piping of project information and increased knowledge sharing

Residual Risk Acceptance Changes
● Characterized by hyper-conservatism in risk acceptance
● Concern that another shuttle mishap would kill the program as well as the space station program

MMT—Level 3

Decision Modality Changes
● Independent Technical Authority implemented balancing programmatic and technical decisions
● MMT Academic and Simulation Training conducted
● Consulting firm utilized to improve decision-making performance
● Dissenting opinions welcomed
● MMT procedures modified

Residual Risk Acceptance Changes
● Became more conservative toward risk acceptance
● Relied on Independent Technical Authorities (ITA) to assist in quantifying risks
● Engineering ITA inputs to decisions weighted more heavily

A third pass through the interview and secondary data, and subsequent synthesis of this information, revealed a set of post-mishap initial conditions for hazard reduction which were believed to play a role in shaping the set of mitigation activities to reduce the hazards identified.

Post-mishap initial conditions for hazard reduction

Throughout the tiered analysis of interview transcripts and secondary data, as they apply to the hazard analysis framework, the authors noted that, for the model to work, a set of initial, or conditioning, events needed to exist. It is analogous to “priming the pump” or stimulating the model to avoid startup transients. We believe that the following conditions are both necessary and sufficient for the framework to produce the necessary hazard reductions which address less-than-adequate management structure, decision-making modalities, risk acceptance behaviors, processes/procedures, communication/knowledge sharing, and cultural issues.

Figure 1. Subject risk acceptance (1–10) empirical cumulative distribution functions.

Mishap is an existential threat to the program—may cause program cancellation

As one program official explained, “This impetus to fly as soon as we could never left us. We have got a space station to build we have got to get on with it, if we sit on the ground long enough, Congress or the president will say, ‘We’re done.’ And when we are done with the shuttle, we are also done with the space station. Senior leadership of the agency saw this as an existential threat—the longer we stay on the ground, the more likely it was the whole enterprise would get canceled.” Viewing the evidence of the threat allowed the mustering of agency-wide resources to respond to the technical and organizational challenges of return-to-flight. Without this threat, it could be debated whether those resources would have been made available.

Major mishap—not a close-call event

Previous research has shown that close calls, also referred to as near-miss events, tend to be interpreted by managers as “evidence of resiliency of the system.”20 The learning output from these events is therefore negligible. A major mishap, particularly one with national repercussions, is undeniably different from a close call in size and scope of the response by the organization. It is distinguished not only by the resources committed to resolving technical issues, but also by the organizational processes that must be re-examined and modified. A major accident also serves to indict the residual risk acceptance posture of the organization.

Table 3. Behavior and Decision Structure Changes Post-STS-107.

Behavioral Changes
● Improve risk communication at all levels
● Remove complacency from decision making; improve engineering rigor
● Calibrate risk acceptance behavior
● Ensure dissenting opinions are heard
● Develop a consensus approach to decision making
● Derive decisions with better data and technical rationale
● Develop a more questioning attitude toward technical issues and related decisions (engineering curiosity)
● Challenge belief system that we fully understand the vehicle/system
● Improve knowledge sharing across the Space Shuttle Program and NASA
● Develop a “get it right” culture

Decision Structure Changes
● Reestablish senior agency leadership involvement in program decisions
● Balance cost, schedule, technical, and safety decision attributes
● Implementation of Independent Technical Authority (ITA)
● Implementation of formal risk management
● Integration and coordination of ISS and SSP decision making
● Review and improve decision forums (PRCB, FRR, and MMT)
● Institutionalize dissenting opinion resolution
● Clarify roles and responsibilities as well as risk acceptance authority
● Institutionalize independent technical analysis (set up NASA Engineering and Safety Center)
● Eliminate organizational stove piping—improve organizational cohesiveness
● Improve technical collaboration across the agency (eliminate center rivalries)

Decision making is a proximate cause of the mishap

Without objective evidence in this area, it is doubtful that substantial change in decision-making behavior would have occurred. As it was, the CAIB report indicted decision making at the program and real-time operations levels; i.e., the MMT. This then required NASA to provide written responses, as part of its return-to-flight rationale, showing that appropriate steps had been taken to fix these areas of concern.

Mishap is a significant emotional event

As large as the Space Shuttle Program was, the flight crew was always treated as a member of the family. For those who had personal interaction with the crew from a management, engineering, training, or operations perspective, they were even closer. For those who interacted with the crew and their families in local schools, churches, or other groups, the loss was even more poignant. The loss of the flight crew was a significant emotional event that caused an extraordinary amount of individual reflection and thoughts about what could have broken the accident chain of events. The emotions were shared among NASA personnel, and as noted in previous research, “Social sharing contributes to construct and consolidate shared knowledge about a collective experience and about people’s emotional responses to this event. In doing so, it perpetuates the emotional climate associated with the sharing of the episode.”21

Senior leadership committed to making changes

Within a month of the Columbia accident, and in parallel with the investigation, the NASA return-to-flight activities had begun. The release of the CAIB report in August 2003 served to formalize the return-to-flight work already in progress. Numerous congressional hearings had been conducted, and the press also weighed in on NASA’s commitment to make the necessary changes to safely and successfully return the shuttle to a flight status. A senior NASA leader commented, “A lot of the program office folks, and a large contingent around NASA entirely, really viewed some of the challenges in implementing these recommendations, as a level of determination of risk that was greater than what they really believed was required at that point. Notwithstanding that, my view was that you just have got to stop somewhere or else we get the paralysis of analysis that would go on indeterminably. This was ‘snap a chalk line’ and say there it is and let’s get on with doing what is necessary to achieve the outcome we are looking for and everyone was in agreement with it, which was return-to-flight and complete the station and get on with the mission at hand. That was a very controversial view and it was one that an awful lot of people were very unnerved about, but at the same time I thought that short of doing something like that, we would be reinterpreting and using the time for that kind of argument rather than actually doing and getting on with the task. By and large, everybody accepted the overwhelming predominance of all the recommendations and findings and we were rolling that way. I said OK, this is a good time to move forward and close a chapter and move to the next one and keep moving along. That was the source of a lot of debate as well. Short of that, I don’t know how we could have concluded this.”

Organization internalizes findings of mishap report

It could be argued that NASA’s Space Shuttle Program was a “high-reliability organization,” or HRO, a term that has been used to describe, for example, the nuclear power industry as well as the nuclear submarine Navy. Certainly, human spaceflight risks require the engineering and technical rigor of management decision-making acumen to keep the vacuum of space out of the human compartments, while the Navy requires keeping water out of its human compartments. Accident analyses are treated similarly across HROs and include “Building an organizational memory of what happened and why, develop a science of accidents that can happen in that organization, communicate organization concern with accidents to reinforce cultural values of safety, and finally, identify parts of the organizations that should have redundancies.”22

This research posits that the post-mishap initial conditions are necessary to provide a forcing function for change. It is now possible to start the construction of a model which incorporates these findings. While the order of progression in the hazard reduction framework may be up for debate, the process which follows still holds.

The first, and preferred, action in hazard analysis is to eliminate the hazard by designing it out of the system. In this case, this is taken to mean the removal of pre-mishap shuttle program management. As one senior NASA manager noted, “There was a decapitation of the shuttle program leadership—everyone that had been in shuttle program management before Columbia got reassigned.” This action initialized and provided support to the remaining hazard reductions. Substitution is the second alternative action and assumes that elimination of the hazard is not practicable. For this model, this implies changing the decision attribute weights so that cost and schedule are no longer drivers—technical and safety attributes receive higher weighting. The third alternative is engineering controls, which implied establishing the ITA. This alternative established an independent check and balance system to counter the autocratic powers of the program manager. The fourth alternative, administrative controls, requires that more engineering test data be collected and that the program conduct more formal risk analysis, such as continuous risk management. Finally, at the lowest rung of the hazard reduction process is safety equipment. This is the last line of defense in the process and the proposed equivalent in this model is management training. At NASA, this occurred primarily, but not exclusively, at the real-time operational level, the MMT. Through these iterative actions, hazards can be mitigated or controlled. The emergent hazard reduction framework is shown in Figure 2.
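The order of precedence described above can be summarized as an ordered data structure. The sketch below is an illustrative restatement of the paragraph, not a reproduction of Figure 2; the class name `Control`, the list `HIERARCHY`, and the helper `first_practicable` are hypothetical names introduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Control:
    rank: int                 # 1 = most preferred action
    generic_action: str       # step in the hazard reduction order of precedence
    shuttle_equivalent: str   # organizational analogue named in the text

# Ordered from most to least preferred, as described in the paragraph above.
HIERARCHY = [
    Control(1, "Elimination", "Removal of pre-mishap shuttle program management"),
    Control(2, "Substitution", "Decision attribute weights shifted toward technical and safety"),
    Control(3, "Engineering controls", "Establishment of the ITA"),
    Control(4, "Administrative controls", "More engineering test data and formal risk analysis"),
    Control(5, "Safety equipment", "Management training, primarily at the MMT"),
]

def first_practicable(practicable_actions):
    """Return the most preferred control whose generic action is practicable, or None."""
    for control in sorted(HIERARCHY, key=lambda c: c.rank):
        if control.generic_action in practicable_actions:
            return control
    return None

choice = first_practicable({"Engineering controls", "Safety equipment"})
print(choice.generic_action, "->", choice.shuttle_equivalent)
```

The helper simply walks the list from the top, which mirrors the logic that a lower rung is considered only when the rungs above it are not practicable. In the article's account, all five were applied after Columbia rather than only the first available one.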

Hazard analysis model outputs

A final analysis and synthesis of primary and secondary information revealed the following post-mishap outputs, or dependent variables, which are the outcomes of applying the hazard reduction framework to less than adequate management structure(s), decision-making modalities, risk acceptance behaviors, processes/procedures, communication/knowledge sharing, and cultural issues.

Develop and execute “Return to Operations” plan

In the case of the space shuttle, the return-to-flight plan was a means of integrating and documenting the technical and organizational responses to the CAIB recommendations. This document was made publicly available, and served as the game plan for improving technical rigor, decision making, and residual risk acceptance behaviors.

Figure 2. Initial post-mishap hazard reduction framework.

Address compliance: Mishap report findings and recommendations

The original dictate after the accident was to “accept, comply, and embrace” the CAIB report findings and recommendations. After more than two years of extensive engineering work, however, the agency needed to address full compliance with recommendations. One senior manager stated, “The reality is that when you go about trying to develop the engineering fixes to address the CAIB recommendations, some of those ideas didn’t look so good in retrospect, and some of them were just not even possible to achieve. So I took an independent look at all of that, and said, okay, some of these things we can do, and some of them we can’t. We’re just not going to do them. I don’t care what anybody said before I got here. Some of these things can’t be implemented.”

Senior leadership must be engaged in program decision making

Leadership learned that simply following the decisions and activities of the Space Shuttle Program was not nearly enough. As one senior leader put it, “We had demonstrated that if something went wrong on shuttle, it wasn’t just the shuttle program manager who was going to be in trouble, it was the entire agency. If you are going to risk the agency, then agency-level management is going to be involved in those decisions.”

Institute independent technical authority

Together, these two outcomes (senior leadership engagement and independent technical authority) served to improve technical rigor and decision making within the agency. NASA’s own guidance required technical authorities to weigh in because “on decisions related to technical and operational matters involving safety and mission success residual risk, formal concurrence by the responsible Technical Authorities is required.”23

Reinstate testing and acquisition of data, and improve analytic capabilities

One manager characterized the Columbia mishap as a failure of engineering analysis, a lack of penetration, and an acceptance of historical test data. For example, the reasoning ran that foam had previously come off the external tank and had never caused any damage, so it was OK. After the mishap, the program performed “a hell of a lot of testing,” which the program had gotten away from. Both test and analytical modeling test data became more important. Prior to the mishap, the program had limited test resources and limited analytical resources, which resulted in a culture of “down-home engineering kind of flight rationale generation.”

Institute a formal risk management process

The Space Shuttle Program started in the 1970s and was granted a waiver from performing formal continuous risk management (CRM) when the agency CRM process was initiated in the 1990s. After the Columbia mishap, a formal CRM process was instituted. The risk matrix was used as a means of facilitating discussion of risks at all levels of the program as well as communicating risk to NASA HQ.
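To make the idea of a risk matrix concrete, the following is a minimal sketch in Python. It assumes a generic 5x5 likelihood-by-consequence grid. The score thresholds, the color bands, the example risk titles, and the names `Risk`, `classify`, and `build_matrix` are hypothetical illustrations and are not taken from NASA's CRM process or from the interviews.

```python
from dataclasses import dataclass

# Hypothetical 5x5 scale: 1 = lowest, 5 = highest. The thresholds used in
# classify() are illustrative assumptions, not NASA's actual CRM criteria.


@dataclass(frozen=True)
class Risk:
    title: str
    likelihood: int   # 1..5
    consequence: int  # 1..5


def classify(risk: Risk) -> str:
    """Map a risk to a color band using likelihood x consequence."""
    for value in (risk.likelihood, risk.consequence):
        if not 1 <= value <= 5:
            raise ValueError("likelihood and consequence must be in 1..5")
    score = risk.likelihood * risk.consequence
    if score >= 15:
        return "red"
    if score >= 6:
        return "yellow"
    return "green"


def build_matrix(risks):
    """Group risk titles by (likelihood, consequence) cell for display."""
    cells = {}
    for r in risks:
        classify(r)  # validates the scores before placing the risk
        cells.setdefault((r.likelihood, r.consequence), []).append(r.title)
    return cells


if __name__ == "__main__":
    example_risks = [
        Risk("Example debris-shedding risk", 3, 5),
        Risk("Example schedule risk", 2, 2),
    ]
    for r in example_risks:
        print(r.title, "->", classify(r))
```

A grid of this kind gives every level of a program a shared vocabulary: a risk's cell and color band can be discussed in a project review and then reported upward without restating the underlying analysis.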

Management simulations and training to address flawed decision-making modalities

At the real-time operations level, as already shown in the anecdotal comments section, the MMT underwent a metamorphosis and was “revamped to be a highly focused, well-trained, critical decision-making body.”24

Address dissenting opinion resolution

Post-mishap, one NASA manager, commenting on the topic of dissenting opinion, remarked, “The engineering organizations understood afterwards that they couldn’t just be potted plants; they had to stand up a little bit more and they did. The other thing that happened very strongly was that the triggers changed. Before Columbia, the triggers were oxygen fire, hatch, and O-rings. After Columbia, we added dissenting opinion and debris. Dissenting opinion became the new emotional trigger. The moment that anybody said dissenting opinion, the argument went emotional for about an hour.” A number of years after the Columbia accident, NASA institutionalized dissenting opinion resolution. Per their program/project requirements, “NASA teams have full and open discussions, with all facts made available, to understand and assess issues. Diverse views are to be fostered and respected in an environment of integrity and trust with no suppression or retribution. In the team environment in which NASA operates, team members often have to determine where they stand on a decision. In assessing a decision or action, a member has three choices: agree, disagree but be willing to fully support the decision, or disagree and raise a Dissenting Opinion.”25

Eliminate organizational stove piping and increase knowledge sharing

Internal to the Space Shuttle Program, barriers to effective knowledge sharing decreased over time. Part of this was due to the fact, as one manager put it, “What was stressed during return-to-flight was the necessity that everyone becomes comfortable with every decision. It was an issue because in an FRR you still cannot bring all of the analysis to the table. But folks were encouraged to speak up, and were told that they were responsible for decisions even if they were not involved in the generation of the data or the analysis.” To accomplish this required increased knowledge sharing across projects and centers. Over time, the Agency created the position of Chief Knowledge Officer, responsible to “coordinate agency-wide initiatives to advance capabilities in identifying, capturing, and transferring knowledge.”26 The agency also set up, as part of the cadre of independent technical authority warrant holders, the NASA Engineering and Safety Center (NESC) as an independent testing and analysis group. NESC also serves as a cache of engineering “know how” across the agency.

Address cultural change issues

The MMT forum was specifically mentioned by the CAIB as an area for improvement. As one senior program official stated, “After we instituted these changes and returned to flight and had a couple of missions under our belt, I had many people tell me over time how important it was what we did and how much they appreciated the changes that were made and the things that were done and the culture change that went forward. I think there is a general and broad acceptance among the working troops and the middle level managers that this was appropriate.” Another cultural change was the institutionalization of technical authority, which resulted in increased technical rigor and compliance with technical requirements. As noted by some researchers, “Targeted and integrated cultural interventions, designed around changing a few critical behaviors at a time, can also energize and engage your most talented people and enable them to collaborate more effectively and efficiently.”27

The model developed here points to a set of conditioning inputs, which the organization uses to identify and develop controls for a set of prioritized hazards. The conditioning inputs allow a more thorough search for a solution set. The final hazard reduction framework is shown in Figure 3. An obvious question, based on the literature review of the hazard analysis process, is: Did NASA prioritize the risks in accordance with CAIB report recommendations? The answer is both yes and no. Many of the initiatives were direct responses to return-to-flight recommendations, and others were derived from the report. Prioritizing the risks might have been a means of allocating resources, but it appeared that NASA treated the hazard reduction framework as an integrated set of items that all must be resolved for the overall solution set to work effectively together.

Summary of model

The initial conditions, noted to the left of the hazard reduction framework, are believed to be necessary and sufficient initiating events to begin the process. The findings in the mishap report are then handled in the tiered hazard reduction framework: (1) less than adequate management structure(s); (2) poor decision-making modalities; (3) misinformed risk acceptance behaviors; (4) less-than-adequate management, engineering, or safety processes and procedures; (5) inhibited communication and knowledge sharing; and (6) detrimental cultural issues. The outputs noted earlier are specific to NASA and the Columbia mishap, but should generically fit other accidents where decision making and risk acceptance behavior are both indicted as contributing factors.

Conclusion

This research was pursued to determine how a major mishap serves to revise management decision-making modalities and residual risk acceptance behavior at several levels of management. The iterative analysis of interview transcripts and secondary data inspired the construction of a hazard analysis model to explain the post-mishap changes at NASA. This model contains a set of conditioning events, setting the stage for a hazard reduction framework, which addresses a prioritized list of organizational changes.

This model is extensible to other large technical/engineering organizations beyond NASA where the hazards involved in the business are significant, such as the loss of life or infrastructure. There are normative aspects to the model developed through this research that will help organizations recover from mishaps, perhaps the primary contribution of this work. There are also several theoretical prospects for future research to enhance understanding of the individual behavioral aspects of learning from mishaps and the potential organizational bridges and barriers to change. Future research might expand upon and contribute to the findings here by examining the Columbia case through the lens of Perrow’s normal accident theory,28 Weick and Roberts’ high reliability organizations paradigm,29 or Heimann’s research into reliable decision making.30

Figure 3. Final hazard reduction framework.

Acknowledgments

The authors would like to acknowledge NASA and NASA-affiliated interviewees for their insights and the generosity of their time in the development and writing of this manuscript. A special thanks to Walter H. Cantrell, RADM, USN (Ret.), former Co-Chair of the NASA Space Flight Leadership Council during the space shuttle return to flight timeframe, for his expert review and contributions.

Notes

1. The Columbia Accident Investigation Board, H. Gehman Jr. (ADM USN Ret.) et al., “Columbia Accident Investigation Board Report,” vol. 1, http://www.nasa.gov/columbia/home/CAIBVol1.html (accessed January 2017).

2. N. Wayne Hale, “The Integrated Risk Acceptance Approach for Return-to-Flight,” (presentation to the Stafford-Covey Return-to-Flight Task Group, June 6, 2005), http://govinfo.library.unt.edu/returnflight/assets/pdf/Integrated_Risk_Acceptance.pdf (accessed May 2016).

3. Safety and Mission Assurance Directorate, National Aeronautics and Space Administration (NASA), Johnson Space Center, “JSC Safety and Health Handbook,” JPR 1700.1K, Change 2, May 2006, https://jschandbook.jsc.nasa.gov/docs/revK/JPR1700-1RevK.pdf (accessed July 2016).

4. Ibid.

5. Ibid.

6. Diane Vaughan, The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA (Chicago, IL: University of Chicago Press, 1996).

7. J. Park, T. Seager, P. S. C. Rao, M. Convertino, and I. Linkov, “Integrating Risk and Resilience Approaches to Catastrophe Management in Engineering Systems,” Risk Analysis 33, no. 3 (2013): 356–357.

8. J. Neumann and O. Morgenstern, Theory of Games and Economic Behavior (Princeton, NJ: Princeton University Press, 1944); D. Kahneman and A. Tversky, “Prospect Theory: An Analysis of Decision under Risk,” Econometrica 47, no. 2 (March 1979): 263–292; R. Keeney and H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Tradeoffs (New York, NY: Cambridge University Press, 1993).

9. Samuel E. Bodily, “The Practice of Decision and Risk Analysis,” Interfaces 22, no. 6 (November–December, 1992): 1–4.

10. W. G. Johnson, “The Management Oversight and Risk Tree (MORT) Including Systems Developed by the Idaho Operations Office and Aerojet Nuclear Company,” prepared for the U.S. Atomic Energy Commission, Division of Operational Safety, February 12, 1973 (Scoville, ID: Aerojet Nuclear Company).

11. Herbert J. Rubin and Irene S. Rubin, Qualitative Interviewing: The Art of Hearing Data, 2nd ed. (London, UK: Sage Publications, 2005).

12. M. Meyer and J. Booker, Eliciting and Analyzing Expert Judgment: A Practical Guide (Philadelphia, PA: American Statistical Association, 2001).

13. Earl Babbie, The Basics of Social Research, 4th ed. (Belmont, CA: Thomson-Wadsworth, 2008).

14. R. Keeney and H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Tradeoffs (New York, NY: Cambridge University Press, 1993); S. French, Decision Theory: An Introduction to the Mathematics of Rationality (West Sussex, UK: Ellis Horwood Limited, 1988).

15. J. Yopchick and N. Kim, Hindsight Bias and Causal Reasoning: A Minimalist Approach (Berlin, Germany: Springer, 2012).

16. J. Saldana, The Coding Manual for Qualitative Researchers, 2nd ed. (Los Angeles, CA: Sage, 2013).

17. Diaz Team Report, “The Implementation of the NASA Agency-Wide Application of the Columbia Accident Investigation Report: Our Renewed Commitment to Excellence,” NASA, http://www.nasa.gov/pdf/58676mainImplementation%20033004%20FINAL.pdf (accessed July 2016).

18. Ibid.

19. Ibid.

20. Robin L. Dillon and Catherine H. Tinsley, “How Near-Misses Influence Decision Making under Risk: A Missed Opportunity for Learning,” Management Science 54, no. 8 (2008): 1425–1440.

21. B. Rimé, “The Social Sharing of Emotion as an Interface Between Individual and Collective Processes in the Construction of Emotional Climates,” Journal of Social Issues 63, no. 2 (2007): 307–322.

22. K. Roberts, R. Bea, and D. Bartles, “Must Accidents Happen? Lessons from High-Reliability Organizations,” Academy of Management Executive 15, no. 3 (2001): 70–79.

23. Office of the Chief Engineer, National Aeronautics and Space Administration (NASA), NASA Headquarters, “NASA Space Flight Program and Project Management Requirements,” (NPR 7120.5E, August 14, 2012), http://nodis3.gsfc.nasa.gov/npgimg/NPR7120005E/NPR7120005E.pdf (accessed July 2016).

24. N. Wayne Hale, “The Integrated Risk Acceptance Approach for Return-to-Flight,” presentation to the Stafford-Covey Return-to-Flight Task Group, June 6, 2005, http://govinfo.library.unt.edu/returnflight/assets/pdf/Integrated_Risk_Acceptance.pdf (accessed May 2016).

25. Office of the Chief Engineer, National Aeronautics and Space Administration (NASA), NASA Headquarters, “NASA Space Flight Program and Project Management Requirements,” (NPR 7120.5E, August 14, 2012), http://nodis3.gsfc.nasa.gov/npgimg/NPR7120005E/NPR7120005E.pdf (accessed July 2016).

26. Office of the Chief Engineer, National Aeronautics and Space Administration (NASA), NASA Headquarters, “NASA Knowledge Policy on Programs and Projects” (NPD 7120.6, November 26, 2013) (Washington, DC: National Aeronautics and Space Administration, 2013).

27. Jon R. Katzenbach, Ilona Steffen, and Caroline Kronley, “Cultural Change that Sticks,” Harvard Business Review 90, no. 7 (July-August 2012): 8.

28. C. Perrow, Normal Accidents: Living with High-Risk Technologies (Princeton, NJ: Princeton University Press, 1984).

29. K. Weick and K. Roberts, “Collective Mind in Organizations: Heedful Interrelating on Flight Decks,” Administrative Science Quarterly 38 (1993): 357–381.

30. C. Heimann, Acceptable Risks: Politics, Policy, and Risky Technologies (Ann Arbor, MI: University of Michigan Press, 1997).
