Abstract

A simulation environment is described that allows the run-time injection of transient and permanent faults and the assessment of their impact in complex systems. The error data from the simulation are automatically fed into analysis software in order to quantify the fault tolerance of the system under test. The features of the environment are illustrated with a case study of a fault-tolerant, dual-configuration, real-time jet-engine controller. The entire controller, described at the logic and functional levels, is simulated, and transient fault injections are performed. In the controller, fault detection and reconfiguration are performed by transactions over the communication links. The simulation consists of instructions specifically designed to exercise this cross-channel communication. The effectiveness of the system's dual configuration against single and multiple transient errors is measured. The results are used to identify critical design aspects from a fault-tolerance viewpoint.

title = "FOCUS: An experimental environment for validation of fault-tolerant systems - Case study of a jet-engine controller",


