Theme TTIT62 - Real-Time Processes and Control
Faults That Lead to Failures
On 4 June 1996, the maiden flight of the Ariane 5 launcher ended in
disaster. Only 40 seconds after initialization of the flight program,
at an altitude of 3700 m, the launcher veered off course, broke up and
exploded.
Immediately after the accident an inquiry board was appointed. The
board analyzed telemetry data from the launcher up to the final moment,
trajectory data from ground-based radar stations, and optical
observations, e.g. from cameras.
The inquiry board's report, presented on 19 July 1996, points to
several underlying causes in the form of design faults in the software,
undisciplined exception handling, and shortcomings in the system
development methodology, particularly in requirements specification
and testing.
The following excerpt from the report recounts the chain of events.
In general terms, the Flight Control System of the Ariane 5 is of a
standard design. The attitude of the launcher and its movements in
space are measured by an Inertial Reference System (SRI). It has its
own internal computer, in which angles and velocities are calculated on
the basis of information from a "strap-down" inertial platform, with
laser gyros and accelerometers. The data from the SRI are transmitted
through the databus to the On-Board Computer (OBC), which executes the
flight program and controls the nozzles of the solid boosters and the
Vulcain cryogenic engine, via servovalves and hydraulic actuators.
In order to improve reliability there is considerable redundancy at
equipment level. There are two SRIs operating in parallel, with
identical hardware and software. One SRI is active and one is in "hot"
stand-by, and if the OBC detects that the active SRI has failed it
immediately switches to the other one, provided that this unit is
functioning properly. Likewise there are two OBCs, and a number of
other units in the Flight Control System are also duplicated.
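
The switching rule described above can be sketched in a few lines. This
is a minimal illustration in Python, not the actual OBC logic: the
class `Sri`, the function `select_sri` and the `failed` flag are
hypothetical names introduced here. The sketch only shows the decision
itself, including the case the rule cannot resolve, namely that the
stand-by unit has failed as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sri:
    """Hypothetical model of one Inertial Reference System unit."""
    name: str
    failed: bool = False


def select_sri(active: Sri, standby: Sri) -> Optional[Sri]:
    """Return the unit to read attitude data from, or None if both failed."""
    if not active.failed:
        return active
    # Switch over only if the stand-by unit is functioning properly.
    if not standby.failed:
        return standby
    return None


sri1, sri2 = Sri("SRI 1"), Sri("SRI 2")
chosen = select_sri(active=sri2, standby=sri1)  # SRI 2 while it is healthy
```

Because both units run identical software, this kind of duplication
guards against independent hardware faults, but a fault in the shared
software can disable both units.
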
The design of the Ariane 5 SRI is practically the same as that of an
SRI which is presently used on Ariane 4, particularly as regards the
software.
Based on the extensive documentation and data on the Ariane 501
failure made available to the Board, the following chain of events,
their inter-relations and causes have been established, starting with
the destruction of the launcher and tracing back in time towards the
primary cause.
- The launcher started to disintegrate at about H0 + 39 seconds
  because of high aerodynamic loads due to an angle of attack of more
  than 20 degrees that led to separation of the boosters from the main
  stage, in turn triggering the self-destruct system of the launcher.
- This angle of attack was caused by full nozzle deflections of the
  solid boosters and the Vulcain main engine.
- These nozzle deflections were commanded by the On-Board Computer
  (OBC) software on the basis of data transmitted by the active
  Inertial Reference System (SRI 2). Part of these data at that time
  did not contain proper flight data, but showed a diagnostic bit
  pattern of the computer of the SRI 2, which was interpreted as flight
  data.
- The reason why the active SRI 2 did not send correct attitude data
  was that the unit had declared a failure due to a software exception.
- The OBC could not switch to the back-up SRI 1 because that unit had
  already ceased to function during the previous data cycle (72
  milliseconds period) for the same reason as SRI 2.
- The internal SRI software exception was caused during execution of a
  data conversion from 64-bit floating point to 16-bit signed integer
  value. The floating point number which was converted had a value
  greater than what could be represented by a 16-bit signed integer.
  This resulted in an Operand Error. The data conversion instructions
  (in Ada code) were not protected from causing an Operand Error,
  although other conversions of comparable variables in the same place
  in the code were protected.
- The error occurred in a part of the software that only performs
  alignment of the strap-down inertial platform. This software module
  computes meaningful results only before lift-off. As soon as the
  launcher lifts off, this function serves no purpose.
- The alignment function is operative for 50 seconds after starting of
  the Flight Mode of the SRIs which occurs at H0 - 3 seconds for
  Ariane 5. Consequently, when lift-off occurs, the function continues
  for approx. 40 seconds of flight. This time sequence is based on a
  requirement of Ariane 4 and is not required for Ariane 5.
- The Operand Error occurred due to an unexpected high value of an
  internal alignment function result called BH, Horizontal Bias,
  related to the horizontal velocity sensed by the platform. This value
  is calculated as an indicator for alignment precision over time.
- The value of BH was much higher than expected because the early part
  of the trajectory of Ariane 5 differs from that of Ariane 4 and
  results in considerably higher horizontal velocity values.
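
The data conversion at the centre of this chain of events can be
illustrated with a short sketch. It is written in Python rather than
the Ada of the SRI software, and everything in it is an assumption made
for illustration: the names `OperandError`, `to_int16_unprotected` and
`to_int16_protected`, the sample values, and the choice of saturation
as the protection. The excerpt does not say how the other conversions
were protected. Inputs are assumed to be finite.

```python
INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(Exception):
    """Hypothetical stand-in for the Operand Error raised in the SRI."""


def to_int16_unprotected(value: float) -> int:
    """Convert to a 16-bit signed integer; raise if the value does not fit."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value!r} does not fit in 16 signed bits")
    return result


def to_int16_protected(value: float) -> int:
    """Convert with a range guard: out-of-range values saturate."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


# A value inside the 16-bit range converts the same way in both versions.
in_range = to_int16_unprotected(1234.9)   # 1234
# A made-up out-of-range value is clamped by the protected version.
clamped = to_int16_protected(40000.0)     # 32767
```

Calling `to_int16_unprotected(40000.0)` raises `OperandError`. If
nothing handles that exception, the unit running the code stops
delivering valid data, which corresponds to the failure the list
describes for both SRIs.
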
The SRI internal events that led to the failure have been reproduced by
simulation calculations. Furthermore, both SRIs were recovered during
the Board's investigation and the failure context was precisely
determined from memory readouts. In addition, the Board has examined
the software code which was shown to be consistent with the failure
scenario. The results of these examinations are documented in the
Technical Report.
Therefore, it is established beyond reasonable doubt that the chain
of events set out above reflects the technical causes of the failure of
Ariane 501.
Simin Nadjm-Tehrani
Last Modified: Thu 12 Jan 2006