Bugs in the Space Program

The talk:

This is the keynote address I gave at the INCOSE International Symposium on Systems Engineering, in July 2005.

The slides for the talk.

Abstract:

Software has an important role to play in making modern systems more flexible, adaptable and autonomous. But we don't yet have a mature engineering discipline for software development. For the systems engineer, important questions are still unanswered: how risky is software in comparison with other parts of a system? Can software be treated as 'just another component'? Or does software demand special attention in systems engineering?

The emerging field of software forensics can shed some light on these questions. By investigating the circumstances surrounding software failures, we get a sense of the risks involved. In this talk, I will use a series of case studies from the space program to draw out some crucial lessons. The examples include the European Space Agency's original Ariane-5 launch vehicle, and several of NASA's Mars probes. Each of these case studies makes a fascinating story in its own right. In each case, the failure appears to be a normal accident: a relatively simple technical problem led to a systems failure because a whole series of systems engineering mistakes allowed it to. However, the failure profiles in these cases reveal some of the key distinguishing characteristics of software. These characteristics have important implications for systems engineering.

The Resource List:

Recommended books:

A Handbook of Software and Systems Engineering: Empirical Observations, Laws, and Theories, by Endres and Rombach.
Mechanizing Proof: Computing, Risk, and Trust, by Donald MacKenzie.
Visual Explanations: Images and Quantities, Evidence and Narrative, by Edward Tufte.
Normal Accidents, by Charles Perrow.
To Engineer is Human: The Role of Failure in Successful Design, by Henry Petroski.
Safeware: System Safety and Computers, by Nancy Leveson.
What Do You Care What Other People Think?, by Richard Feynman.

Space Shuttle

Current info about the shuttle:
- http://spaceflight.nasa.gov/shuttle/
Info about Challenger:
- http://www-pao.ksc.nasa.gov/kscpao/shuttle/missions/51-l/mission-51-l.html
Rogers Commission Report (see especially Appendix F, by Richard Feynman):
- http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/rogers-commission/table-of-contents.html
A succinct summary of the key factors and issues:
- http://ethics.tamu.edu/ethics/ethics/shuttle/shuttle1.htm

Ariane-5

Info about ESA’s launchers:
- http://www.esa.int/export/esaLA/launchers.html
Inquiry report & press release:
- http://www.esrin.esa.it/htdocs/tidc/Press/Press96/press33.html

Mariner 1

Data sheet:
- http://nssdc.gsfc.nasa.gov/nmc/tmp/MARIN1.html
Summary of problems, from RISKS Digest vol. 5 no. 73:
- http://catless.ncl.ac.uk/Risks/5.73.html

Mars Observer

Project summary:
- http://www.msss.com/mars/observer/project/mo_loss/moloss.html
Brief summary of possible causes:
- http://catless.ncl.ac.uk/Risks/14.89.html#subj1

Mars Pathfinder

Project info:
- http://mars.jpl.nasa.gov/MPF/index1.html
Report on the priority inversion problem:
- http://catless.ncl.ac.uk/Risks/19.49.html#subj1

Mars Climate Orbiter

Project info:
- http://mars.jpl.nasa.gov/msp98/orbiter/
Investigation report:
- ftp://ftp.hq.nasa.gov/pub/pao/reports/2000/MCO_MIB_Report.pdf

Mars Polar Lander & Deep Space 2

Project info:
- http://mars.jpl.nasa.gov/msp98/lander/
- http://mars.jpl.nasa.gov/msp98/ds2/
Investigation reports:
- http://www.nasa.gov/newsinfo/marsreports.html

General Resources

RISKS forum archive:
- http://catless.ncl.ac.uk/Risks/
JPL’s list of missions (past, present and future):
- http://www.jpl.nasa.gov/missions/missions_index.html
Basics of Space Flight:
- http://www.jpl.nasa.gov/basics/