failure assessment may be difficult (Issue #25) - It may be difficult to detect, diagnose, and evaluate the consequences of automation failures (errors and malfunctions), especially when behavior seems 'reasonable', possibly resulting in faulty or prolonged decision making.

Evidence Type:

Excerpt from Experiment

Evidence:

"Attitude indicator (ADI) failure. When the attitude indicator failed, it drifted slowly to approximately 25-30 degrees right bank when the aircraft was in level flight. The result was that the autopilot attempted to follow the failed instrument, placing the aircraft in a left bank. This was not a failure of the AP system but rather, a failure of the sensor feeding data to the system and was comparatively subtle. We were particularly interested in how long pilots took to diagnose the problem. Initial diagnosis (recognition of the general problem) times ranged from 12.7 to 263 seconds (mean = 48.83; median = 34.82). Times to positively identify the failed ADI ranged from 13.83 to 264.6 seconds (mean = 58.79; median = 39.63). Regarding return of the aircraft to level flight, first crossing of zero-degrees bank required an average of 22.11 seconds (median = 21.68). Thus, as would be expected, regaining flight control preceded completed diagnosis. This was aided by the visible, albeit faint, horizon between the cloud layers. Several pilots exhibited persistence of behavior in that they continued to follow the ADI even after an initial leveling in bank. One, in fact, continued to fly in a wide circle until contacted by ATC." (page 3)

Strength:

+1

Aircraft:

unspecified

Equipment:

automation

Source:

Beringer, D.B. (1997). Automation Effects in General Aviation: Pilot Responses to Autopilot Failures and Alarms. In R.S. Jensen & L. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology. Columbus, OH: The Ohio State University.

Evidence Type:

Excerpt from Experiment

Evidence:

"When asked to report on the difficulty and ease of diagnosing and recovering from autopilot failures experienced during their experimental session, our subjects unanimously agreed that runaway pitch trim was the most difficult from which to recover. The most difficult failure to diagnose was split across three: ADI, pitch sensor, and runaway pitch trim, with each failure receiving 27% of the votes. Pitch sensor was voted the easiest to diagnose by 46% of the subjects, with runaway pitch trim being cited by 36%. Pitch sensor was voted easiest to correct by 56% of the subjects." (page 78)

"Soft pitch (pitch sensor). The pitch-sensor failure caused a slow deviation from level pitch while the ADI continued to show correct pitch indications, simulating loss of sensor data to the autopilot. First response to this failure ranged from 330 msec to 73.7 seconds (mean = 16.62; median = 12.51). AP disconnect times ranged from 5.91 to 73.7 seconds (mean = 24.8; median = 15.4). Although 60% of the pilots disconnected in less than 20 seconds, 33% fell between 30 and 60 seconds. This was due both to the comparative subtlety of the failure and to the ability of pilots to manually override the pitch servo without disconnecting." (page 77)

"The commanded-roll failure emulated an AP-commanded roll that exceeded the target bank angle. Analyses for both roll malfunctions and the soft-pitch malfunction are based on time from initial failure to disconnect of the AP by any means (yoke-mounted disconnect, panel disengage, circuit breaker). Times ranged from 1.8 sec to 107.1 sec (means, medians, and ranges are summarized in Table 1). However, 69% of the pilots disconnected within 13 sec of the initial failure and half within 8 sec. These “immediate” disconnects by 18 of the 29 pilots [62%] were defined by sequences in which no other significant actions occurred between failure onset and AP disconnect…Using an RT of 8.7 sec or less as a cutoff value, 93.7% [18 out of 29 pilots or 62%] of the sample of immediate responders were included. Eleven pilots [11 out of 29 or 37%] initially chose to manually override the AP prior to their disconnecting the AP, whether by using the control-wheel steering option or by overpowering the aileron servo. One extreme outlier was removed, however, reducing the number to 10 [10 out of 29 or 34%] for the examined distribution." (page 160)

"Soft pitch (pitch sensor). The soft pitch failure was rated as most difficult to diagnose (by 12 of 26 pilots [46%]) and was rated third easiest to correct, missing a tie for second by one tally. Performances were again categorized as either immediate disconnect (12 out [of 29 or 41%]) or manual override (17 [out of 29 or 58%]), ... Three pilots never diagnosed the failures [3 out of 29 or 10%], manually flying the airplane without disconnecting the AP; their scores and one other outlier were removed, leaving 13. Immediate disconnects averaged 17.7 sec (range = 6.5-31.5), and the 13 remaining manual overrides averaged 46.19 (range = 15.2-76.2)." (page 162)

"Runaway pitch trim. This failure was different from the others in that only by pulling the pitch trim circuit breaker would the problem be corrected. The interim solution was the AP disconnect/trim interrupt switch. Only three pilots chose the optimal response, depressing and holding the disconnect, then pulling the circuit breaker. Four others depressed and held the disconnect at various times during the recovery. The vast majority of initial responses were yoke AP disconnect (15), followed in frequency by panel-mounted AP-engage switch (5), mode manipulation (2), manual override (2), and pitch trim circuit breaker (1). Data from 4 participants were removed from consideration due to circumstances that contaminated these data. Of the 25 remaining, 21 of the pilots were classified as immediate responders, 2 were classified as manual overriders, and 2 as mode changers. It should also be noted that two pilots never heard the warning tone, possibly due to high-frequency hearing loss, responding only to aircraft performance changes." (page 163)

"The referenced reports often dealt with problems such as altitudes not being captured, crossing restrictions not being met, and climb and descent rates being excessive. As Table 4-3 showed, crossing restrictions not met represent 40% of all flight phase categories in these 99 reports. In many of the reports, an altitude excursion was the result of the FMS not performing as expected, or the flight crew not recognizing that the FMS was not working properly or was mis-programmed. It is likely that many of these incidents occur because the FMS algorithms are designed to level off the aircraft at the last minute. ... The last minute nature of the leveling-off process, coupled with missing the altitude alert cues, means that the crew knows a problem has occurred only when the airplane does not level off, at which time it is probably too late to perform any actions that can prevent the altitude deviation." (page 4.7)

Further examples of both selectivity and confirmation bias can be found in the 1988 Air France crash at Mulhouse-Habsheim (Degani et al 1996), where the pilots continued to believe that they could avert disaster by fighting with the plane’s joystick, despite the fact that their actions were not affecting the flight path as expected and in the China Airlines crash during a descent into Nagoya, where the pilots continued to believe that they could safely land the plane using the joystick, whilst a mistaken engagement of full forward thrust made it practically impossible to do so. Both incidents support the view that knowledge gaps lie behind many examples of automation surprise. (page 8)

"There is a need, therefore, to ensure that warning messages relayed to the crew are optimal in terms of leading crew to the primary cause. Accordingly, the following two statements needed to be addressed:
1. All warnings appropriate to the situation are given....
Responses were generally favorable (M = 2.94, SD = 1.53 for the first statement;…" (page 177)

"After G/S capture, a G/S signal loss was simulated at approximately 3,000 ft. ... Although detection time was not measured for this failure, it was observed that it took some pilots a rather long time (in some cases, several minutes) to even realize the problem although they were looking directly at the ADI (with the G/S indications and FD bars disappearing) during this phase of flight." (page 17)

"Pilots were asked to describe instances where FMS behavior surprised them and to indicate modes/features of FMS operation that they did not understand. There were no sharp boundaries between the incidents elicited by the two questions. Pilot reports are categorized according to their underlying theme." ... There were 3 reports [3 / 135 = 2.2%] in the category: "The effects of partial system failures ... These pilots report that they are unsure of the consequences of partial FMS failures. After such failures, they can not tell which subsystems are still active, which systems are available, or how the failure may interact with the active flight control mode. These reports implicate potential problems with both pilots' mental model of the FMS structure and with the indications of FMS status and behavior." (page 307-313)

"Failure to immediately detect a failure of the flight management and guidance computer: 6 cases. A single flight management and guidance computer (FMGC) failure leads to a loss of redundancy, as input from either MCDU is now sent to the one remaining operational computer, and any entry on one MCDU is transferred to both MCDUs. Information can no longer be exchanged and cross-checked between the two FMGSs...The pilot reports in this category refer to situations in which only one or a subset of all possible indications were available. In most cases, the corresponding autopilot and flight director were not engaged. In that case, a single FMGC failure does not involve any of the aural alerts associated with autopilot or flight director disconnect. The lack of such auditory feedback may have contributed to the reported failures to immediately realize the problem." (page 561)