The Failure of a Small Satellite and the Loss of a Space Science Mission

R. Katz
National Aeronautics and Space Administration
Electrical Systems Center
Goddard Space Flight Center

8/8/00

Overview

• Background and Introduction

• How did the mission* fail?

• Why did the mission fail?

* SMEX/WIRE: Small Explorer / Wide-Field Infrared Explorer

"rk"

• Experience: JPL, NASA GSFC

• Design Engineer, Electrical
– Galileo, Magellan, Cassini, ISTP, SIRTF, MGS, SMEX, etc.

• Research and Technology Development
– Logic, FPGAs, Radiation, Design Techniques

• Reviews, Failure Investigations
– Cassini, HST, EOS-AM, AXAF, HETE-2, SIRTF, etc.
– Small Explorer WIRE

Failure Examples (Simplified)

Mars Climate Orbiter: Units
Mars Polar Lander: 1 Line of Missing Software
Ariane V/501: Operand Error, Unprotected
Sea Launch: Ground S/W Logic; Valve Config
Intelsat VI: “Two wires crossed”
TERRIERS: Inverted Sign
IUS 21: Tape/Thermal Wrap
Titan IV: Data Entry Error
SMEX/WIRE: 1 Wire, Disable Buffer

Payload/Launcher Success Rates

[Figure: success rate (%) versus year, 1990 to 1999; y-axis from 84 to 98. Series: Payload, Payload Fit, Launch Vehicle, Launcher Fit.]

1999 Payload Failures

1. WIRE (NASA)

2. TERRIERS (Boston University/AeroAstro)

3. Abrixas (Germany)

4. SACI 1 (Brazil)

All Small Scientific Satellites

Small Explorer (SMEX) Program

Spacecraft    Mass (kg)    Launch Date
Galileo       2,562        1989
SMEX          150-300      1992-1999
SMEX/WIRE     250          1999
UoSAT-12      325          1999
SNAP-1        7            2000

Wide-Field Infrared Explorer: Programmatic

PI: JPL

Spacecraft: NASA Goddard Space Flight Center

Instrument: Utah State University - SDL

Launch: Orbital Sciences Corp. - Pegasus XL

Cost: $75 million

Duration: 4 Months

Wide-Field Infrared Explorer: Technical

Objective: Deep Infrared, Extragalactic Survey

Detectors: Two 128 x 128 Si:As Arrays

Telescope: 30 cm Cassegrain

Cryostat: Solid Hydrogen; Dual Stage 7 K/12 K.

Orbit: 540 kilometers

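Logic System Overview

[Block diagram. Spacecraft side: SCS and SPE supply +28V, ARM, and FIRE to the pyro box. Pyro box: an LM117 regulator produces +5 VDC for a 200 kHz crystal oscillator, a power-on-reset (POR) pulse circuit (R, C, 4093B), and the A1020 FPGA. The FPGA receives the 200 kHz clock, POR, ARM, and FIRE, and drives the relay and FET stages for the pyro.]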

WIRE Spacecraft

[Figure: WIRE spacecraft, with labels: Composite Spacecraft, Star Tracker, Modular Solar Array, Aperture Shade.]

The WIRE Mission

March 4th: Launch, Vandenberg Air Force Base/L-1011

T+9 min: Separation Nominal

T+29 min: Antarctica Pass - Vent Command Transmitted

T+79 min: NORAD Tracks 3 Objects, Including Cover

T+99 min: Alaska Pass - Tumbling*

T+36 Hrs: Cryogen Supply Exhausted

March 8th: Mission Declared Lost

* Eventually Spun up to 60 rpm

Loss of Control - Telemetry

Root Cause of Failure (1)

The root cause of a failure is the mechanism that directly caused the mishap.

Significant contributing causes include events or conditions that could have been used to identify this condition, since the phenomenon had been understood.

Contributing factors are other events or conditions that might have been able to prevent the mishap and should have been done significantly better.

Root Cause of Failure (2)

The root cause of the WIRE mission loss is a digital logic design error in the instrument pyro electronics box.

The transient performance of components was not adequately accounted for in its design.

The failure was caused by two distinct mechanisms that, either singly or in concert, resulted in inadvertent pyrotechnic device firing during the initial pyro box power-up.

Requirements for Failure

• Design Error (2)

• Errors Not Caught In:
– Analysis
– Simulation
– Design Reviews
– Box Level Tests
– Instrument Level Tests
– Spacecraft Integration Tests
– Spacecraft Systems Tests
– Final Reviews

SMEX/WIRE System

Why Did WIRE “Spin Up?”

• Zero Thrust Vent - a “T.”

• Vent Located To Minimize Pressure (Temperature).

• One Side of “T” Pointed At Connector.

• No Analysis of Exit Design During a Worst-Case Venting Scenario.

• ACS Could Not Overcome Force

• Spun Up To 60 RPM

"System" Perspective

[Block diagram. Spacecraft side: the Spacecraft Computer System (80386/387) and the Spacecraft Power Electronics supply +28V (from the +28V bus), ARM, and FIRE. Instrument side: the pyro box, labeled "PYRO Subsystem", drives the cover pyros and the vent pyros.]

A 4th level of protection was an arming plug.

Basic Pyro Characteristics

• NASA Standard Initiator, Type 1 (NSI-1)

• No-Fire: 1 Amp and 1 Watt for 5 minutes

• Bridgewire Impedance: ~ 1 Ω

• Fire Time: ~ 1 ms @ 5 amps
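
The no-fire and firing figures above can be related with simple Ohm's-law arithmetic. The following is a minimal sketch, assuming the bridgewire behaves as a fixed 1 Ω resistor (a simplification that ignores heating); the function names are illustrative and do not come from the original slides.

```python
# Rough bridgewire arithmetic for an NSI-1 class initiator.
# ASSUMPTION: the bridgewire is a constant 1-ohm resistor (no heating effects).

BRIDGEWIRE_OHMS = 1.0


def power_watts(current_amps: float, resistance_ohms: float = BRIDGEWIRE_OHMS) -> float:
    """Power dissipated in the bridgewire, P = I^2 * R."""
    return current_amps ** 2 * resistance_ohms


def energy_joules(current_amps: float, duration_s: float,
                  resistance_ohms: float = BRIDGEWIRE_OHMS) -> float:
    """Energy delivered by a constant current over duration_s seconds."""
    return power_watts(current_amps, resistance_ohms) * duration_s


no_fire_power = power_watts(1.0)          # 1 A no-fire level
fire_power = power_watts(5.0)             # 5 A firing current
fire_energy = energy_joules(5.0, 1e-3)    # 5 A applied for about 1 ms

print(f"no-fire power: {no_fire_power:.1f} W")
print(f"fire power:    {fire_power:.1f} W")
print(f"fire energy:   {fire_energy * 1e3:.1f} mJ")
```

Under this assumption, 1 A matches the 1 W no-fire rating, while a 5 A pulse dissipates 25 times that power, so a transient of roughly a millisecond at that current is in the range that can fire the device.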

"Pyro Box" Perspective

[Block diagram: +28V power and the Arm and Fire logic signals enter the instrument pyro box, which drives multiple pyro functions (vent, cover).]

Pyro Box is powered off during launch

• Pulse forming
• Timing
• Lockouts
• Filtering

FPGA - Complex
• FSM
• Counters

Voltage Regulation

Regulator Circuit

15 µF and 0.1 µF capacitors.

[Schematic: regulator with +28V IN and +5V OUT.]

EM Regulator Performance

[Oscilloscope capture: +28V input and +5 VDC regulator output at power-up, 5 ms/div.]

Logic Design (1)

Reset Circuitry and Crystal Clock Oscillator

Flight Oscillator on System Board

Crystal Oscillator Characteristics

It is known that crystal oscillators do not start immediately with the application of power. From Horowitz and Hill's The Art of Electronics, 2nd Edition:

... However, because of its high-resonant Q, a crystal oscillator cannot start up instantaneously, and an oscillator in the megahertz range typically takes 5-20 ms to start up; a 32 kHz oscillator can take up to a second (Q = 10^5). ...

Start up time for oscillators is sometimes not included in the specification.

- SMEX/WIRE Class S screening specification did not include a start up time limit.
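
The link between high Q and slow start-up can be illustrated with the usual envelope time constant of a resonator, tau = 2Q / omega0 = Q / (pi * f0). This is a minimal sketch under that first-order assumption only: a real oscillator's start time also depends on loop gain and supply rise time and usually spans several time constants, so the result is an order-of-magnitude figure. The function name is illustrative.

```python
import math


def envelope_time_constant_s(q_factor: float, freq_hz: float) -> float:
    """Amplitude time constant of a resonator: tau = 2Q / omega0 = Q / (pi * f0)."""
    return q_factor / (math.pi * freq_hz)


# Values from the quotation above: a 32 kHz crystal with Q = 10^5.
tau = envelope_time_constant_s(1e5, 32e3)
print(f"tau = {tau:.2f} s")  # on the order of one second
```

The estimate comes out close to one second, which agrees with the quoted "up to a second" for a 32 kHz crystal.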

Example Oscillator Start Time

[Oscilloscope capture: +5 VDC supply and 200 kHz oscillator output at power-up, 1 ms/div.]

Power Supply Rise Time = 1 ms for this example

Summary of Oscillator Start Times

[Figure: SMEX WIRE Oscillator Startup Time Test, T = 10C. X-axis: Power Supply Rise Time (msec), measured from 10%-90%, 0 to 200. Y-axis: Start Time (msec) From Power Supply @ Startup, logarithmic scale, 1 to 1000.]

Summary of Oscillator Start Times

[Figure: SMEX WIRE Oscillator Startup Time Test, T = 10C. X-axis: Power Supply Rise Time (msec), measured from 10%-90%, 0 to 200. Y-axis: Start Time (msec) From Power Supply @ Startup, linear scale, 0 to 250.]

Oscillator Startup on WIRE EM

[Oscilloscope capture: +28V, +5V, and 200 kHz oscillator output at power-up, 5 ms/div; annotated interval: 23 ms.]

Logic Analysis
Assuming Random Power Up Of Flip-Flops

• Reset Flip-Flops
– 3 Flip-Flops
– At Least One Must Be A “0” To Be Safe
– 7 Chances In 8

• ARMCNT Block
– 14 Flip-Flops
– All Must Be A “0” To Be Safe
– One Chance In 16,384

• TIMECNT Block
– 8 Flip-Flops
– All Must Be A “0” To Be Safe
– One Chance In 256

Note: Two Sides, P(Failure) ~ 25%
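
The ~25% figure can be approximately reproduced from the flip-flop counts. The slide does not state how the three blocks combine, so this is a sketch under an assumed model: each flip-flop powers up as 0 or 1 with equal probability, a side misfires when the reset flip-flops are unsafe and the counters are not both in their safe state, and the two sides are independent. All variable names are illustrative.

```python
from fractions import Fraction


def p_all_zero(n_flip_flops: int) -> Fraction:
    """Probability that n independent, unbiased flip-flops all power up as 0."""
    return Fraction(1, 2 ** n_flip_flops)


# Reset block: unsafe only if all 3 flip-flops power up as 1 (1 chance in 8).
p_reset_unsafe = Fraction(1, 2 ** 3)
# Counter blocks: safe only if every flip-flop powers up as 0.
p_armcnt_safe = p_all_zero(14)    # 1 / 16,384
p_timecnt_safe = p_all_zero(8)    # 1 / 256

# ASSUMPTION (not stated on the slide): a side misfires when the reset
# flip-flops are unsafe and the counters are not both in their safe state.
p_counters_safe = p_armcnt_safe * p_timecnt_safe
p_side = p_reset_unsafe * (1 - p_counters_safe)

# Two independent sides; either one misfiring is a failure.
p_failure = 1 - (1 - p_side) ** 2

print(f"per side:  {float(p_side):.4f}")
print(f"two sides: {float(p_failure):.4f}")
```

Under these assumptions the two-side result is about 0.23, in line with the slide's estimate of roughly 25%.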

Logic Design (2)

FPGA Transient Behavior

FPGA and Drivers

[Block diagram labels: A1020 FPGA powered from +5 VDC, with 200 kHz clock, POR, ARM, and FIRE signals; relay and FET driver stages; +28 VDC; PYRO.]

FPGA Implementation: Charge Pump And Isolation FETs

[Diagram labels: Charge Pump, HV Isolation FETs, Module Input, Module Output, Antifuse.]

A1020 Output Transient Overview

Documented In Actel App Notes; EEE Links, WWW Site

Not Documented In Data Sheet

I/O May Power-up Uncontrolled

Inputs May Source Current

Outputs May Be Invalid

Truth Tables Not Followed

Device Architecture

Requires HV Isolation FETs ON

Charge Pump Needs Time To Start, Bias HV FETs

Output Transient - Investigation

• Flight Pattern Obtained From SDL

• Devices Programmed For Bench Test

– A1020B’s (3)

– Non-flight A1020 (1)

– Flight A1020 (2)

• Transients Observed On Critical Outputs

• Critical Outputs May Be Latched High

A1020 Sample Transient

[Oscilloscope capture: Cover, Arm, and VCC traces at power-up, 5 ms/div.]

Device Had Been Powered Off For 2 Days

A1020 FPGA Output Transient Summary

• Longer power supply rise times
– Increase the probability of the transient
– Increase the size of the transient

• Quick power cycles tend to eliminate transients

• Long power-off times tend to increase the chance of a transient (memory effect).

Now it was known how to test the Engineering Model

Failure Demonstration on EM

[Oscilloscope capture: A Side Power Input current, 5 A/div; annotated intervals: 13.5 msec and 1.6 msec.]

Instrument Level Testing

Fidelity of Spacecraft Power Electronics (SPE) Simulation

Relay Operating Characteristics

[Figure: WIRE Failure Analysis, Relay Operate Time, Flight Spare S/N 001, NASA GSFC, May 19, 1999. X-axis: Coil Voltage (volts), 11 to 18. Y-axis: Operate Time (msec), 0 to 140.]

Notes:

1. Pulse width = 800 msec
2. Neither of the two relays would operate at 11V

Instrument Level Testing: +28V Bench Power Supply

[Oscilloscope capture: +28V bench power supply rise, 50 ms/div, 10 V/div, annotated at three points: Logic Begins To Function, Relay Starts To Operate, Relay Closes.]

Spacecraft Level Testing

Fidelity of Pyrotechnic Simulation

EED Simulator - Input Stage

Easy To “Trip”

Low-Impedance Switched In After Delay

EED Simulator - Delay

[Oscilloscope capture: +5 VDC (2 V/div) and current (1 A/div), 10 ms/div; annotated interval: 23 ms.]

Spacecraft Level Testing

Problem Reporting and Analysis

Reporting Mechanism Not Used

• Simulator Box Tripped In System Level Tests

• Procedure Was To Reset The Simulator
– Dispositioned "OK" By Similarity to Previous Mission With Different Hardware Set
– Not Troubleshot in Depth
– Design Engineer Not Involved

• No Failure Report Written
– Eliminated Reviews of Failure Report

Conclusions and Points for Discussion

Reviews

• Single System Review

• Pyro Box Not Ready For Review
– Never Reviewed: “Fell Through The Cracks”

• Would Reviews Have Prevented Mission Loss?
– SDL Engineers Not Familiar With Startup Transient In A1020 Device
– Neither Was The Local Actel FAE
– Customer Review Board Members?

Makeup Of Review Teams And Depth of Reviews Are Critical

Simulation

• Simulation Is A Valuable Tool

• Simulation And Analysis Work On Models Of Hardware

• Simulation Models Are Not 100% Accurate.
– Frequently Poor For Transient Conditions, Like Startup, In Digital Circuits

Logic Simulation (Machine) Cannot Replace Analysis (Human)

Testing

• Fidelity of Test Equipment Critical

• “Test As You Fly; Fly As You Test”
– End-to-End Testing with Realistic Timelines

• Qualification By Test Limited
– Reliability Cannot Be "Tested Into" A System

• Process For Failure Reporting And Disposition.
– Real-Time Disposition Without Full Documentation and Proper Analysis Is Obviously Risky

Fault Tolerance

• Designers Concerned About Getting The Cover Off, Not Keeping It On.

• Analysis Of Worst-Case Situations
– Worst-Case Venting Scenarios for WIRE Not Analyzed.

• Sizing of Attitude Control System

Trade-offs of Risk vs. Cost/Size vs. Performance

Complexity

• Design More Complex Than Required.

• Extra Protection Features.

• Redundancy Doubled Probability of Failure.
– Parallel Reliability Model For WIRE Architecture

• WIRE Designers' Comments on Critical Function Architectures
– “KISS” principle

• Analyze requirements

• Develop several approaches to meet same

• Analyze from different perspectives

• Pick the simplest one, all other things being equal
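
The point that redundancy doubled the probability of failure can be sketched with a basic two-outcome model. This is only an illustration, assuming independent, identical sides; the per-side probability below is hypothetical and the function names are not from the original material.

```python
def p_any_side_misfires(p_side: float, n_sides: int = 2) -> float:
    """Inadvertent firing: the mission fails if ANY side fires when it should not."""
    return 1 - (1 - p_side) ** n_sides


def p_all_sides_fail_to_fire(p_side: float, n_sides: int = 2) -> float:
    """Failure to fire on command: the mission fails only if EVERY side fails."""
    return p_side ** n_sides


p = 0.125  # hypothetical per-side probability, for illustration only

misfire = p_any_side_misfires(p)        # larger than p: redundancy hurts
no_fire = p_all_sides_fail_to_fire(p)   # smaller than p: redundancy helps

print(f"any side misfires:      {misfire:.4f}")
print(f"all sides fail to fire: {no_fire:.4f}")
```

For small per-side probabilities the first expression is close to twice the single-side value. A second side therefore protects against a failure to fire but roughly doubles exposure to an inadvertent firing.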

Mission Outcome As A Function of Complexity and Budget

[Figure: mission outcomes plotted against complexity and budget, with WIRE marked.]

Mission Outcome As A Function of Complexity and Schedule

[Figure: mission outcomes plotted against complexity and schedule, with WIRE marked.]

Additional Reading and References

• “WIRE Mishap Investigation Board Report,” Darrell R. Branscome, Chairman, NASA Headquarters, June 8, 1999.

• “Small Explorer WIRE Failure Investigation Report,” Richard B. Katz, NASA Goddard Space Flight Center, May 29, 1999.

• “Startup Design and Analysis Note,” Richard B. Katz, NASA Goddard Space Flight Center, May 12, 1999.

• “Start up Application Concerns with Actel Corp. Field Programmable Gate Arrays (FPGAs),” NASA Parts Advisory NA-046, May 27, 1999.

• “Why Space Mishaps Are On The Rise,” Marco Caceres, AIAA Aerospace America, July 2000, pp. 18-20.

• “Use of FPGA's in Critical Space Flight Applications-A Hard Lesson,” W. Gibbons and H. Ames, Utah State University, Mil/Aero Applications of Programmable Logic Devices International Conference, 1999.

• Failure Reports For Various Missions Collected at: http://rk.gsfc.nasa.gov/reports.htm

• “Aerospace Corp. Study Shows Limits of Faster-Better-Cheaper,” Michael A. Dornheim, Aviation Week & Space Technology, June 12, 2000, pp. 47-49.

• “Recovery of the Wide-Field Infrared Explorer Spacecraft,” D. Everett, T. Correll, S. Schick, and K. Brown, 14th Annual AIAA/USU Conference on Small Satellites, 2000.