Reliability Engineering

System performance analysis often demands more than traditional reliability block diagramming (RBD) tools can juggle. It’s critical to take into account the relationship between components. How does the reset of one component affect the whole system? How can you identify potential failures? How can you accurately predict and manage the risks around assets that could fail and cause unnecessary and expensive downtime? How can you resolve uncertainty in multi-stage manufacturing systems?

The Reliability module in ExtendSim Pro is the missing link bridging reliability block diagrams with the pinpoint accuracy of simulation to mimic the behavior of systems using dynamic reliability modeling. Maintenance reliability professionals, asset managers, and predictive maintenance teams are turning to simulating RBDs in ExtendSim to help manage their asset reliability programs, reduce failure rates, optimize alternate flow paths, deal with intermediate product storage, and improve the reliability of plant assets.

Procter & Gamble (P&G) has used ExtendSim to "model all product lines - from soap to nuts" for design of equipment and production lines, scheduling, commerce, quality, etc. The models they build are used to interface with engineers who are not necessarily simulation experts but can use them for analysis and design. Because of ExtendSim's design, these engineers ultimately become simulation experts while using it.

A consultant for a branch of the US Military is using ExtendSim to build inventory and reliability models of the frequency of failure of parts on military aircraft. Once a part has failed, how does it get replaced and will the replacement part be in inventory or not? And if the aircraft is out of service, will there be another one to replace it?

Deployment of digital smart grid sensing, communication, and control technologies that improve electric grid security, reliability, and efficiency is growing exponentially. ExtendSim is being used to dynamically monitor grid operations - identifying appropriate security controls based on parameters and constraints then simulating mission assurance indicators before and after defense actuation to gauge effectiveness.

Dow Chemical Company performs reliability modeling in ExtendSim to identify and understand the impact of different failures on overall production capabilities in chemical plants. The model is used for understanding the key equipment components that contribute most to production loss and for analyzing the impact of policy changes, such as the installation of new equipment or an increased stock level for failure-prone components. A Failure Summary Report provides information for further phases of the analysis.

In the first pilot project of its kind, DNV GL found all possible root causes for critical system failures directly from design documentation. They made a list of cut sets from an ExtendSim model of the signal flow, where the fault tree is the result, not the input. This project resulted in a more reliable yet less expensive safety analysis of the system.

The European railway industry is continuously advancing and, in recent years, has adopted a new system called the European Rail Traffic Management System/European Train Control System (ERTMS/ETCS) for the interoperability of railways among European nations. Currently, it is used more extensively for commuter and freight transportation. The foremost quality of such a transportation system is to operate reliably and maintain punctuality. In this context, Bane NOR (Norwegian National Rail Administration) is planning to convert the entire conventional signalling system to ERTMS signalling as part of its ERTMS National Implementation project.

ERTMS/ETCS is a complex infrastructure of various trackside, lineside, and onboard systems, and these systems have different subsystems comprising software, hardware, network, and signalling components. Due to this complexity, identifying failures and resolving them is challenging. An existing line operated on ERTMS by Bane NOR is taken as a case study for developing a reliability model.

A reliability block diagram method is primarily used to model the Østfoldbanen Østre Linje (ØØL) ERTMS pilot line in ExtendSim's Reliability module. The model incorporates a combination of single station and bidirectional (BiDi) sections, and 1000 simulations were conducted to assess the ØØL ERTMS infrastructure. The results suggest that this model has the potential to determine the performance of the infrastructure. They also indicate that the predominant infrastructure failures causing delays are partial interlocking failure, maintenance, and track fracture, followed by failures of balises, axle counters, and points.
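To make the idea of simulating a reliability block diagram concrete, here is a minimal Python sketch. It is not the ØØL model and it does not use ExtendSim: the block names are borrowed from the failure categories above, while the series/parallel layout, the failure rates, the exponential lifetimes, and the absence of repair are all illustrative assumptions. It estimates the probability that the system survives a mission time over 1000 runs and compares that with the closed-form value for the same diagram.

```python
import math
import random

# Hypothetical constant failure rates (per hour); not taken from the case study.
FAILURE_RATES = {
    "interlocking": 1e-4,
    "axle_counter_a": 5e-4,
    "axle_counter_b": 5e-4,
    "points": 2e-4,
}


def block(name):
    """A single RBD block: works if the named component is up."""
    return lambda up: up[name]


def series(*parts):
    """Series arrangement: works only if every part works."""
    return lambda up: all(part(up) for part in parts)


def parallel(*parts):
    """Parallel (redundant) arrangement: works if any part works."""
    return lambda up: any(part(up) for part in parts)


# Assumed layout: interlocking -> (two redundant axle counters) -> points.
SYSTEM = series(
    block("interlocking"),
    parallel(block("axle_counter_a"), block("axle_counter_b")),
    block("points"),
)


def simulate_reliability(mission_hours, runs=1000, seed=1):
    """Monte Carlo estimate of P(system survives mission_hours), no repair."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(runs):
        # A component is up at the end of the mission if its sampled
        # exponential lifetime exceeds the mission time.
        up = {
            name: rng.expovariate(rate) > mission_hours
            for name, rate in FAILURE_RATES.items()
        }
        survived += SYSTEM(up)
    return survived / runs


def analytic_reliability(mission_hours):
    """Closed-form reliability of the same diagram, for comparison."""
    r = {n: math.exp(-rate * mission_hours) for n, rate in FAILURE_RATES.items()}
    pair = 1.0 - (1.0 - r["axle_counter_a"]) * (1.0 - r["axle_counter_b"])
    return r["interlocking"] * pair * r["points"]


if __name__ == "__main__":
    print("simulated:", simulate_reliability(1000.0))
    print("analytic: ", analytic_reliability(1000.0))
```

A static diagram like this one can be solved analytically. Simulation becomes useful once repairs, maintenance windows, and dependencies between components are added, which is the situation the case study describes.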

The frequency of demands is crucial when analyzing a safety instrumented system (SIS). IEC 61508 distinguishes between low and high demand mode when calculating risk for such a system. In reality there are systems that cannot clearly be placed in one of the two modes. These types of systems are called intermediate demand mode systems, which are analyzed in this thesis. Not many published SIS reliability studies focus on the problems related to this borderline. Oliveira predicts somewhat strange behavior for the hazard rate in the intermediate demand mode, as well as with a focus on the demand duration.

The results from the analyses of a redundant system show that the standard Probability of Failure on Demand (PFD) formulae are usable for very low demand rates but become increasingly conservative as one moves into the intermediate mode, while the Probability of Failure per Hour (PFH) is non-conservative. This can have major consequences for the operator of a safety system, who may fail to obtain the optimal testing strategy or, even worse, encounter a hazard.
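The conservatism of the low-demand formula can be illustrated with a deliberately simple Python sketch. It assumes a single-channel (1oo1) system rather than the redundant system analyzed in the thesis, exponential times to dangerous undetected failure, Poisson demands, and a component that is as good as new at the start of each proof-test interval. The function names and numeric values are hypothetical. The sketch compares the usual low-demand estimate, PFD_avg (taken as λτ/2) multiplied by the expected number of demands in the interval, with the exact probability that a failure is followed by a demand before the next proof test.

```python
import math


def hazard_low_demand_formula(lam_du, lam_demand, tau):
    """Low-demand estimate: PFD_avg * expected demands in one test interval."""
    pfd_avg = lam_du * tau / 2.0
    return pfd_avg * lam_demand * tau


def hazard_exact(lam_du, lam_demand, tau):
    """P(failure time + waiting time to next demand <= tau).

    This is the CDF of the sum of two independent exponentials with
    different rates (the two rates must not be equal).
    """
    a, b = lam_du, lam_demand
    if a == b:
        raise ValueError("rates must differ for this closed form")
    return 1.0 - (b * math.exp(-a * tau) - a * math.exp(-b * tau)) / (b - a)


if __name__ == "__main__":
    lam_du = 2e-6   # assumed dangerous undetected failure rate, per hour
    tau = 8760.0    # assumed proof-test interval: one year in hours
    for lam_demand in (1e-5, 1e-4, 1e-3, 1e-2):  # demands per hour
        approx = hazard_low_demand_formula(lam_du, lam_demand, tau)
        exact = hazard_exact(lam_du, lam_demand, tau)
        print(f"demand rate {lam_demand:g}/h: ratio approx/exact = {approx / exact:.2f}")
```

Under these assumptions the two values agree closely when demands are rare, and the formula increasingly overestimates the hazard as the demand rate grows, because a frequent demand reveals the failure soon after it occurs. The sketch does not cover the PFH comparison or the redundant architectures discussed in the thesis.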

For more complex systems with several components, the Markov approach has its limits; the choice of distributions and maintenance details is also restricted. Discrete event simulation can deal with such complex systems, and it can also satisfactorily handle the rare event problem that is often a challenge for safety system analysis.

By use of Harel Statecharts and discrete event Monte Carlo simulations for different safety systems, it is shown that the intermediate demand mode depends on the relationship between the proof tests, demands, and repair duration. When the demand rate increases to a significant level, demands can be used as tests. With Harel Statecharts we can build realistic models that go beyond what a Markov model is capable of.

Uncertainty is an unavoidable issue in software engineering and an important area of investigation. This paper studies the impact of uncertainty on total duration (i.e., make-span) for implementing all features in operational release planning, including:
• the number of new features arriving during release construction
• the estimated effort needed to implement features
• the availability of developers
• the productivity of developers

An integrated method is presented combining Monte-Carlo simulation (to model uncertainty in the operational release planning (ORP) process) with process simulation (to model the ORP process steps and their dependencies as well as an associated optimization heuristic representing an organization-specific staffing policy for make-span minimization). The method allows for evaluating the impact of uncertainty on make-span. The impact of uncertainty factors, both in isolation and in combination, is studied at three different pessimism levels through comparison with a baseline plan. Initial evaluation of the method is done by an explorative case study at Chartwell Technology Inc. to demonstrate its applicability and its usefulness.

Results. The impact of uncertainty on release make-span increases – both in terms of magnitude and variance – with an increase of pessimism level as well as with an increase of the number of uncertainty factors. Among the four uncertainty factors, we found that the strongest impact stems from the number of new features arriving during release construction. We have also demonstrated that for any combination of uncertainty factors their combined (i.e., simultaneous) impact is bigger than the addition of their individual impacts.

The added value of the presented method is that managers are able to study the impact of uncertainty on existing (i.e., baseline) operational release plans pro-actively.
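The Monte Carlo part of such a study can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method from the paper: it replaces the process simulation and staffing heuristic with a plain capacity calculation (total effort divided by effective developer capacity), and the baseline efforts, team size, sampling distributions, and the way a pessimism level scales each factor are all assumptions made for this example.

```python
import random
import statistics

# Hypothetical baseline plan: effort per feature in person-days.
BASELINE_EFFORTS = [5.0, 8.0, 3.0, 13.0, 8.0]
DEVELOPERS = 3
NEW_FEATURE_EFFORT = 5.0  # assumed effort of each late-arriving feature


def make_span(efforts, developers, availability=1.0, productivity=1.0):
    """Simplified make-span: total effort over effective team capacity."""
    return sum(efforts) / (developers * availability * productivity)


def sample_make_span(rng, pessimism):
    """Draw one scenario in which all four uncertainty factors vary."""
    # Factor 1: estimated effort is inflated by up to `pessimism`.
    efforts = [e * rng.uniform(1.0, 1.0 + pessimism) for e in BASELINE_EFFORTS]
    # Factor 2: new features arrive during release construction.
    arrivals = rng.randint(0, round(10 * pessimism))
    efforts += [NEW_FEATURE_EFFORT] * arrivals
    # Factors 3 and 4: developer availability and productivity drop.
    availability = rng.uniform(1.0 - pessimism, 1.0)
    productivity = rng.uniform(1.0 - pessimism, 1.0)
    return make_span(efforts, DEVELOPERS, availability, productivity)


def study(pessimism, runs=2000, seed=7):
    """Return the mean and standard deviation of the sampled make-span."""
    rng = random.Random(seed)
    samples = [sample_make_span(rng, pessimism) for _ in range(runs)]
    return statistics.mean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    baseline = make_span(BASELINE_EFFORTS, DEVELOPERS)
    print(f"baseline make-span: {baseline:.2f} days")
    for level in (0.1, 0.2, 0.3):  # three assumed pessimism levels
        mean, spread = study(level)
        print(f"pessimism {level}: mean {mean:.2f}, std dev {spread:.2f}")
```

Each scenario can only lengthen the plan in this sketch, so the sampled make-span never falls below the baseline. A fuller model would also let individual factors be switched on and off to compare their isolated and combined impact.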

Procter & Gamble partnered with the Energy Department's Los Alamos National Laboratory (LANL) in the 1990s. LANL scientists helped P&G engineers develop simulations to improve the reliability of P&G's complex production lines. P&G's 150 facilities worldwide saw a 44 percent increase in plant productivity and 30 percent increase in equipment reliability since they started using the software.

The pairing of the lab's and the corporation's data led to the creation of simulation software called Reliability Technology in 1993. With the software, engineers could configure both the machines and their maintenance schedules based on reliability. In addition, engineers could foresee and possibly avoid product jams, intervals of component breakage, or variations in machine speeds. In other cases, engineers could triage the production line. Large-scale implementation of the technology helped save P&G $1 billion in manufacturing costs, according to Procter & Gamble. These cost-saving benefits are applicable to production lines across the manufacturing sector.

This ExtendSim SimCast describes the different methods used to model reliability in ExtendSim. It features examples that use both blocks and items to represent failures in the system or process. It also includes a first look at the Reliability module in ExtendSim Pro while it was still early in its development stage.