Appendix F - Personal observations on the reliability of the Shuttle
by R. P. Feynman

Introduction
It appears that there are enormous differences of opinion as to the
probability of a failure with loss of vehicle and of human life. The
estimates range from roughly 1 in 100 to 1 in 100,000. The higher
figures come from the working engineers, and the very low figures from
management. What are the causes and consequences of this lack of
agreement? Since 1 part in 100,000 would imply that one could put a
Shuttle up each day for 300 years expecting to lose only one, we could
properly ask "What is the cause of management's fantastic faith in the
machinery?"
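
The arithmetic behind that remark can be checked with a short sketch. It assumes a 365-day year with leap days ignored; the variable names are illustrative only.

```python
# Check of the arithmetic in the text: one launch per day for 300 years.
DAYS_PER_YEAR = 365   # assumption: leap days ignored
YEARS = 300

flights = DAYS_PER_YEAR * YEARS            # total launches in 300 years
p_loss_management = 1 / 100_000            # low end of the quoted range
p_loss_engineers = 1 / 100                 # high end of the quoted range

# Expected number of lost vehicles = flights * probability of loss per flight.
expected_losses_management = flights * p_loss_management
expected_losses_engineers = flights * p_loss_engineers

print(f"launches: {flights}")
print(f"expected losses at 1 in 100,000: {expected_losses_management:.2f}")
print(f"expected losses at 1 in 100: {expected_losses_engineers:.0f}")
```

Over the same launch schedule, the two ends of the quoted range differ by a factor of 1,000 in expected losses.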

We have also found that certification criteria used in Flight
Readiness Reviews often develop a gradually decreasing strictness. The
argument that the same risk was flown before without failure is often
accepted as an argument for the safety of accepting it again. Because
of this, obvious weaknesses are accepted again and again, sometimes
without a sufficiently serious attempt to remedy them, or to delay a
flight because of their continued presence.

There are several sources of information. There are published criteria
for certification, including a history of modifications in the form of
waivers and deviations. In addition, the records of the Flight
Readiness Reviews for each flight document the arguments used to
accept the risks of the flight. Information was obtained from the
direct testimony and the reports of the range safety officer, Louis
J. Ullian, with respect to the history of success of solid fuel
rockets. There was a further study by him (as chairman of the launch
abort safety panel (LASP)) in an attempt to determine the risks
involved in possible accidents leading to radioactive contamination
from attempting to fly a plutonium power supply (RTG) for future
planetary missions. The NASA study of the same question is also
available. For the history of the Space Shuttle Main Engines,
interviews with management and engineers at Marshall, and informal
interviews with engineers at Rocketdyne, were made. An independent
(Caltech) mechanical engineer who consulted for NASA about engines
was also interviewed informally. A visit to Johnson was made to gather
information on the reliability of the avionics (computers, sensors,
and effectors). Finally, there is a report "A Review of Certification
Practices, Potentially Applicable to Man-rated Reusable Rocket
Engines," prepared at the Jet Propulsion Laboratory by N. Moore, et
al., in February, 1986, for NASA Headquarters, Office of Space
Flight. It deals with the methods used by the FAA and the military to
certify their gas turbine and rocket engines. These authors were also
interviewed informally.

Solid Rockets (SRB)

An estimate of the reliability of solid rockets was made by the range
safety officer, by studying the experience of all previous rocket
flights. Out of a total of nearly 2,900 flights, 121 failed (1 in
25). This includes, however, what may be called early errors: rockets
flown for the first few times, in which design errors are discovered
and fixed. A more reasonable figure for the mature rockets might be 1
in 50. With special care in the selection of parts and in inspection,
a figure below 1 in 100 might be achieved, but 1 in 1,000 is
probably not attainable with today's technology. (Since there are two
rockets on the Shuttle, these rocket failure rates must be doubled to
get Shuttle failure rates from Solid Rocket Booster failure.)
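
The figures in this paragraph can be reproduced with a short sketch. It takes "nearly 2,900" as exactly 2,900 and assumes the two boosters fail independently of each other, which the text does not state. Under that assumption the exact chance that at least one booster fails is 2p - p^2, which for small p is very nearly the doubling used above. The raw ratio 121/2,900 comes out at about 1 in 24, which the text rounds to 1 in 25.

```python
# Figures quoted in the text; "nearly 2,900" is taken as exactly 2,900 here.
FAILURES = 121
FLIGHTS = 2900

def one_in(p):
    """Express a probability p as the N in '1 in N'."""
    return 1 / p

def any_booster_fails(p_booster, n_boosters=2):
    """Chance that at least one of n boosters fails (independence assumed)."""
    return 1 - (1 - p_booster) ** n_boosters

p_all = FAILURES / FLIGHTS  # raw historical rate, early errors included

for label, p in [("all flights", p_all),
                 ("mature rockets", 1 / 50),
                 ("special care", 1 / 100)]:
    doubled = 2 * p                 # the doubling rule stated in the text
    exact = any_booster_fails(p)    # equals 2p - p**2 under independence
    print(f"{label}: booster 1 in {one_in(p):.0f}, "
          f"Shuttle 1 in {one_in(doubled):.1f} (doubled), "
          f"1 in {one_in(exact):.1f} (exact)")
```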

NASA officials argue that the figure is much lower. They point out
that these figures are for unmanned rockets, but since the Shuttle is a
manned vehicle "the probability of mission success is necessarily very
close to 1.0." It is not very clear what this phrase means. Does it
mean it is close to 1 or that it ought to be close to 1? They go on to
explain "Historically this extremely high degree of mission success
has given rise to a difference in philosophy between manned space
flight programs and unmanned programs; i.e., numerical probability
usage versus engineering judgment." (These quotations are from "Space
Shuttle Data for Planetary Mission RTG Safety Analysis," Pages 3-1,
3-1, February 15, 1985, NASA, JSC.) It is true that if the probability
of failure were as low as 1 in 100,000, it would take an inordinate
number of tests to determine it (you would get nothing but a string
of perfect flights, from which no precise figure could be drawn other
than that the probability is likely less than 1 divided by the number
of such flights in the string so far). But if the real probability is
not so small, flights would show troubles, near failures, and possibly
actual failures within a reasonable number of trials, and standard
statistical methods could
give a reasonable estimate. In fact, previous NASA experience had
shown, on occasion, just such difficulties, near accidents, and
accidents, all giving warning that the probability of flight failure
was not so very small. The inconsistency of the argument not to
determine reliability through historical experience, as the range
safety officer did, is that NASA also appeals to history, beginning
"Historically this high degree of mission success..."

Finally, if we are to replace standard numerical probability usage
with engineering judgment, why do we find such an enormous disparity
between the management estimate and the judgment of the engineers? It
would appear that, for whatever purpose, be it for internal or
external consumption, the management of NASA exaggerates the
reliability of its product, to the point of fantasy.
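
The earlier point about a string of perfect flights can be made concrete with a standard textbook calculation. This is offered as an illustration only, not as the method used by the range safety officer or by NASA. If n flights in a row succeed, the largest failure probability p still consistent with that record at a chosen confidence level solves (1 - p)^n = 1 - confidence. At 95 percent confidence this is roughly 3/n. The function names and the flight counts below are assumptions made for the example.

```python
import math

def upper_bound_zero_failures(n_flights, confidence=0.95):
    """Largest per-flight failure probability p still consistent, at the
    given confidence, with n_flights successes and no failures.
    Solves (1 - p)**n_flights = 1 - confidence for p."""
    return 1 - (1 - confidence) ** (1 / n_flights)

def flights_needed(p_target, confidence=0.95):
    """Consecutive perfect flights needed before the bound above
    drops to p_target."""
    return math.ceil(math.log(1 - confidence) / math.log1p(-p_target))

for n in (25, 100, 1000):  # illustrative flight counts, not from the text
    p_max = upper_bound_zero_failures(n)
    print(f"{n} perfect flights: p could still be as high as "
          f"1 in {1 / p_max:.0f}")

print("perfect flights needed to support 1 in 100,000:",
      flights_needed(1e-5))
```

On these assumptions, supporting a figure of 1 in 100,000 from flight experience alone would take a few hundred thousand consecutive perfect flights, which is the sense in which the number of tests would be "inordinate."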

The history of the certification and Flight Readiness Reviews will not
be repeated here. (See the other parts of the Commission report.) The
phenomenon of accepting for flight seals that had shown erosion and
blow-by in previous flights is very clear. The Challenger flight is
an excellent example. There are several references to flights that had
gone before. The acceptance and success of these flights is taken as
evidence of safety. But erosion and blow-by are not what the design
expected. They are warnings that something is wrong. The equipment is
not operating as expected, and therefore there is a danger that it can
operate with even wider deviations in this unexpected and not
thoroughly understood way. The fact that this danger did not lead to a
catastrophe before is no guarantee that it will not the next time,
unless it is completely understood. When playing Russian roulette the
fact that the first shot got off safely is little comfort for the
next. The origin and consequences of the erosion and blow-by were not
understood. They did not occur equally on all flights and all joints;
sometimes more, and sometimes less. Why not sometime, when whatever
conditions determined it were right, still more, leading to
catastrophe?

In spite of these variations from case to case, officials behaved as
if they understood it, giving apparently logical arguments to each
other, often depending on the "success" of previous flights. For
example, in determining if flight 51-L was safe to fly in the face of
ring erosion in flight 51-C, it was noted that the erosion depth was
only one-third of the radius. It had been noted in an experiment
cutting the ring that cutting it as deep as one radius was necessary
before the ring failed. Instead of being very concerned that
variations of poorly understood conditions might reasonably create a
deeper erosion this time, it was asserted that there was "a safety
factor of three." This is a strange use of the engineer's term "safety
factor." If a bridge is built to withstand a certain load without the
beams permanently deforming, cracking, or breaking, it may be designed
for the materials used to actually stand up under three times the
load. This "safety factor" is to allow for uncertain excesses of load,
or unknown extra loads, or weaknesses in the material that might have
unexpected flaws, etc. If now the expected load comes on to the new
bridge and a crack appears in a beam, this is a failure of the
design. There was no safety factor at all, even though the bridge did
not actually collapse because the crack went only one-third of the way
through the beam. The O-rings of the Solid Rocket Boosters were not
designed to erode. Erosion was a clue that something was wrong.
Erosion was not something from which safety can be inferred.

There was no way, without full understanding, that one could have
confidence that conditions the next time might not produce erosion
three times more severe than the time before. Nevertheless, officials
fooled themselves into thinking they had such understanding and
confidence, in spite of the peculiar variations from case to case. A
mathematical model was made to calculate erosion. This was a model
based not on physical understanding but on empirical curve fitting.