A Method for Estimating Bounds on System Reliability in the Absence of Times-to-Failure Data

A common method for obtaining confidence bounds on system reliability uses the variance of the reliability of each component along with the
system reliability equation. This analytical method (discussed in the August
2009 issue of Reliability HotWire) requires that you have failure data for each component in the system.
If times-to-failure data are not available for all components (e.g., the reliability information is obtained from a supplier who will share the
distribution parameters, but not the actual failure data), some reliability engineers use a worst case/best case method based on the lower or upper
confidence bounds for the parameters of the components. However, when data are available to compare the results, this approach provides estimates that
are quite far from the analytical solution.

This article introduces a simulation method for obtaining approximate bounds on system reliability that does not require data for each component, yet
provides results that are closer to the analytical solution than the worst case/best case method.

Analytical Method

Jacob started with a set of times-to-failure data for a specific component. In Weibull++, he
fitted a 2-parameter Weibull model to the data and created a plot with 90% 2-sided confidence bounds, as shown in Figure 1.

Jacob then published the model to make it available in BlockSim, as shown in Figure 2.

Figure 2: Published model of times-to-failure data

Now suppose that Jacob wanted to compute 90% 2-sided confidence bounds for a system of 10 of these components arranged in a parallel configuration.
He created the block diagram shown in Figure 3 and assigned the Universal Reliability Definition (URD) shown in Figure 4 to each component block.
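For reference, the point estimate of reliability for a parallel system of independent components is one minus the product of the component unreliabilities. The sketch below illustrates this for 10 identical Weibull components, assuming the median parameter estimates reported later in this article (beta = 2.10, eta = 1819). The function names are illustrative only and are not part of Weibull++ or BlockSim, and the sketch does not compute the analytical confidence bounds.

```python
import math

BETA = 2.10        # Weibull shape, median estimate reported in this article
ETA = 1819.0       # Weibull scale, median estimate reported in this article
N_PARALLEL = 10    # number of components in the parallel configuration


def weibull_reliability(t, beta=BETA, eta=ETA):
    """Reliability of one 2-parameter Weibull component at time t."""
    return math.exp(-((t / eta) ** beta))


def parallel_reliability(component_reliabilities):
    """A parallel system fails only if every independent component fails."""
    unreliability = 1.0
    for r in component_reliabilities:
        unreliability *= 1.0 - r
    return 1.0 - unreliability


def system_reliability(t):
    """Point estimate for N_PARALLEL identical components in parallel."""
    return parallel_reliability([weibull_reliability(t)] * N_PARALLEL)


if __name__ == "__main__":
    for t in (1000.0, 2000.0, 3000.0):
        print(t, system_reliability(t))
```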

Worst Case/Best Case Method

Jacob had heard about other reliability engineers who fit distributions to only the lower and upper bounds of each component and used those
bounds as the failure models in the RBD in order to obtain the system-level bounds. He decided to try this method with his current data set by doing the following:

Create a table of bounds on unreliability versus time in Weibull++.

Copy the data for each bound and the corresponding time to a free-form folio, then use the Distribution Wizard to determine the model and parameters
that best fit the bounds data. In general, the distribution that fits the bounds best will not be the same as the distribution that fits the
times-to-failure data. In other words, if the times-to-failure data follow a Weibull distribution, the median estimate will follow a Weibull
distribution but it is very likely that the bounds will not follow a Weibull distribution, as shown in Table 1.

Table 1. Distribution and parameters for models fit to median estimate and confidence bounds of component data set

Unreliability    5%                   50%         95%
Model            Generalized gamma    Weibull     Generalized gamma
Parameter 1      μ = 7.57             β = 2.10    μ = 7.43
Parameter 2      σ = 0.500            η = 1819    σ = 0.477
Parameter 3      λ = 0.671            N/A         λ = 1.32

Use an overlay plot to compare the original reliability vs. time plot (with bounds) to the fitted distributions. Figure 6 shows that in this case the
original and fitted data were virtually indistinguishable.

Publish the fitted models for the upper and lower bounds to make them available to use in BlockSim.

Duplicate the block diagram used in the analytical method twice. In the first diagram, assign all blocks a single URD using
the lower bound as the failure model; in the second diagram, assign all blocks a single URD using the upper bound as the failure model.

Jacob plotted the best case and worst case bounds along with the analytical solution and its confidence bounds, as shown in Figure 7. He observed that the worst
case and best case solutions are more than twice as far from the analytical solution as the analytical confidence bounds are. He decided to try a different approach.

Simulation Method

To more closely approximate the analytical confidence bounds, Jacob considered what system confidence bounds really mean. They are a measure of
the variability of the reliability of the system that takes into account the variability of each input block. The worst case (or best case) method assumes that
every block in the system has lower (or higher) than average reliability, while the analytical method assumes that the system has a mixture of lower than
average reliability components, average reliability components, and higher than average reliability components. So Jacob concluded that in order to more
closely approximate the analytical method, he must incorporate some cases where all components have lower than average reliability, some cases where
all components have higher than average reliability, and many cases where there are both lower and higher than average reliability components.
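A rough calculation shows why the all-low and all-high cases should be rare. The sketch below assumes, purely for illustration, that each of the 10 components independently falls below its median reliability with probability 0.5. Both the independence and the use of the median as "average" are assumptions, not statements from the analysis above.

```python
from math import comb

N_COMPONENTS = 10
P_LOW = 0.5  # assumption: each component is independently below median half the time


def prob_k_low(k, n=N_COMPONENTS, p=P_LOW):
    """Binomial probability that exactly k of n components are below median."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_all_low = prob_k_low(N_COMPONENTS)    # the worst case scenario
p_all_high = prob_k_low(0)              # the best case scenario
p_mixed = 1.0 - p_all_low - p_all_high  # everything in between

if __name__ == "__main__":
    print("all low: ", p_all_low)
    print("all high:", p_all_high)
    print("mixed:   ", p_mixed)
```

Under these assumptions the worst case and best case each occur in about 1 of every 1024 draws, which is consistent with those bounds sitting far outside the analytical ones.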

Jacob started by modeling the variability of the Weibull parameters of the component failure distribution by doing the following:

Create the table of estimates of beta and eta at different percentiles given in Table 2.

Table 2. Data used to describe variability of model parameters

Percentile    Beta    Eta
10            1.81    1658
30            1.97    1752
50            2.10    1819
70            2.23    1889
90            2.43    1995

Use a free-form folio to fit a separate distribution for each parameter to obtain the parameter models given in Table 3.

Table 3. Distributions of component parameters

Parameter                Beta         Eta
Model                    lognormal    lognormal
ln-mean                  0.740        7.51
ln-standard deviation    0.115        0.0721

Publish the model for each parameter for use in the subsequent RENO simulation.
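The lognormal parameters in Table 3 can be approximated outside of Weibull++ by regressing the natural log of each percentile value in Table 2 on the corresponding standard normal quantile. This is a minimal sketch that assumes an ordinary least-squares fit. The fitting method used by the free-form folio is not stated in this article, so small differences from Table 3 are to be expected.

```python
import math
from statistics import NormalDist

# Values taken from Table 2
PERCENTILES = [10, 30, 50, 70, 90]
BETA_VALUES = [1.81, 1.97, 2.10, 2.23, 2.43]
ETA_VALUES = [1658, 1752, 1819, 1889, 1995]


def fit_lognormal(percentiles, values):
    """Least-squares fit of ln(value) = ln_mean + ln_std * z.

    z is the standard normal quantile of each percentile.
    Returns (ln_mean, ln_std).
    """
    z = [NormalDist().inv_cdf(p / 100) for p in percentiles]
    y = [math.log(v) for v in values]
    z_bar = sum(z) / len(z)
    y_bar = sum(y) / len(y)
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    ln_std = sxy / sxx
    ln_mean = y_bar - ln_std * z_bar
    return ln_mean, ln_std


if __name__ == "__main__":
    print("beta:", fit_lognormal(PERCENTILES, BETA_VALUES))
    print("eta: ", fit_lognormal(PERCENTILES, ETA_VALUES))
```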

Jacob created a RENO simulation that will compute system reliability from samples of the component parameter distributions
by doing the following:

Create the appropriate variables (e.g., time values, reliability values and other variables to control the flow of the RENO simulation), as shown in Figure 8:

Figure 8: Variables used in RENO simulation

Create an analytical RBD with 10 components in parallel. Figure 9 shows how he gave each component a separate URD with a dynamic failure model that
uses the value contained in one of the static reliability variables created in the previous step.

Figure 9: RBD called by RENO simulation

Publish the analytical RBD model to use in the RENO simulation.

Create a Synthesis Workbook to store results.

Create a RENO flowchart with an inner loop for computing system reliabilities at a given time and an outer loop for generating
time values to be used in the inner loop.

Each pass through the inner loop:

Computes a static reliability value for each component from pairs of parameters randomly chosen from the distributions of beta and eta as shown in Figure 10.

Figure 10: Block properties for the first component showing the equation used to generate the static reliability value used in the RBD

Calculates the system reliability from the published RBD model and the component static reliability values, then stores the result, as shown in Figure 11.

Figure 11: Block properties for computing the system reliability. The upper left window shows the call to the RBD to obtain the system unreliability. The lower right window calculates the system reliability.

Each pass through the outer loop (shown in blue) computes and stores a time value for use in the inner loop (shown in yellow).

Figure 12: RENO flowchart. The inner loop for calculating system reliability for multiple combinations of parameters at a particular time
is in yellow and the outer loop for generating time values to be used in the inner loop is in blue.
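The two loops can be sketched outside of RENO as follows. This is an illustration rather than the RENO flowchart itself: it assumes that beta and eta are sampled independently for each component from the lognormal models in Table 3, that the RBD is a simple parallel arrangement of 10 independent components, and that 99 runs are made at each time value. The time values and the random seed are hypothetical.

```python
import math
import random

# Lognormal parameter models from Table 3
LN_MEAN_BETA, LN_STD_BETA = 0.740, 0.115
LN_MEAN_ETA, LN_STD_ETA = 7.51, 0.0721

N_COMPONENTS = 10  # blocks in the parallel RBD
N_RUNS = 99        # inner-loop passes per time value


def component_reliability(t, rng):
    """Static reliability at time t from one random (beta, eta) pair."""
    beta = rng.lognormvariate(LN_MEAN_BETA, LN_STD_BETA)
    eta = rng.lognormvariate(LN_MEAN_ETA, LN_STD_ETA)
    return math.exp(-((t / eta) ** beta))


def system_reliability(t, rng):
    """One inner-loop pass: sample every component, then evaluate the RBD."""
    system_unreliability = 1.0
    for _ in range(N_COMPONENTS):
        system_unreliability *= 1.0 - component_reliability(t, rng)
    return 1.0 - system_unreliability


def simulate(times, seed=0):
    """Outer loop over time values; returns {t: sorted system reliabilities}."""
    rng = random.Random(seed)
    results = {}
    for t in times:
        results[t] = sorted(system_reliability(t, rng) for _ in range(N_RUNS))
    return results


if __name__ == "__main__":
    for t, values in simulate([1000.0, 2000.0, 3000.0]).items():
        print(t, values[0], values[-1])
```

Sampling beta and eta independently ignores any correlation between the two parameter estimates, which is one reason the results are only an approximation of the analytical bounds.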

For each time value stored in the workbook, sort the reliability values in ascending order. For this scenario, 99 reliability values are computed at each time step,
so the 5th smallest value roughly corresponds to the lower 90% 2-sided bound and the 95th smallest value corresponds to the upper bound, as shown in Figure 13.

Figure 13: RENO simulation results sorted in ascending order for each time step. The lower bound of system reliability is shown in bold. (Note that only the first 20 percentiles and the first 15 time steps are shown.)
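Picking the bounds from the sorted values can be sketched as below. With 99 values, the value at rank k (counting from the smallest) is treated as the k-th percentile, so ranks 5, 50 and 95 give the lower bound, median estimate and upper bound. The function name and the rank rule for sample sizes other than 99 are assumptions made for illustration.

```python
def bounds_from_runs(values, lower_pct=5, upper_pct=95):
    """Return (lower bound, median, upper bound) from simulated reliabilities.

    Rank k of n sorted values is taken as the 100*k/(n+1) percentile, so with
    n = 99 the 5th, 50th and 95th smallest values are returned.
    """
    ordered = sorted(values)
    n = len(ordered)

    def pick(pct):
        rank = round(pct * (n + 1) / 100)  # 1-based rank
        rank = min(max(rank, 1), n)        # keep the rank inside the sample
        return ordered[rank - 1]

    return pick(lower_pct), pick(50), pick(upper_pct)


if __name__ == "__main__":
    # Hypothetical input: 99 evenly spaced values, used only to show the ranks
    demo = [k / 100 for k in range(1, 100)]
    print(bounds_from_runs(demo))
```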

To compare the three methods, Jacob plotted the results on a single graph, as shown in Figure 14. The plot shows that the simulation method
results in a reliability estimate and bounds that are much closer to the analytical solution than the worst case / best case bounds. At larger time
values, the simulation median estimate and bounds are greater than the analytical solution, but the distance between the bounds and the median estimate is preserved.

The simulation method is a reasonable approximation to the analytical method for diagrams with few blocks that have
reliability equations that are sums of products of component reliabilities. However, the simulation method should not be used for diagrams with
complex dependencies, such as load share or standby redundancy, as it will give incorrect results in these scenarios.

Conclusion

This article presented a method for obtaining approximate confidence bounds on system reliability when times-to-failure data are unavailable.
The method compared favorably to calculating confidence bounds on system reliability analytically. It can provide a reasonable estimate of the
variability of system reliability when sufficient information to build an analytical model is not available.

Appendix

For the purpose of comparing the simulation method to the analytical method, we used the variances of the parameters computed in Weibull++ to
obtain distributions to describe each parameter; however, we could just as easily start from rough estimates of the variability of the parameters
of the failure time distribution. For example, Figure 15 shows how we could use as little information as the median estimate of beta provided by the supplier (e.g., 2.1)
and an expert opinion on the 90th percentile value of beta (e.g., 2.4) with the Quick Parameter Estimator to get a rough estimate of the
variability of beta to use in place of the model fit to the various percentiles of beta.

Figure 15: Use of Quick Parameter Estimator to define distributions of component parameters in the absence of times-to-failure data
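The same idea can be sketched by hand. Assuming beta is modeled as lognormal (as in Table 3), a median and one other percentile are enough to solve for both lognormal parameters. The sketch below does this for the example values above. It is not a description of how the Quick Parameter Estimator works internally, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_from_median_and_percentile(median, value, pct):
    """Solve for (ln_mean, ln_std) of a lognormal from two quantiles.

    median is the 50th percentile; value is the pct-th percentile (pct != 50).
    """
    z = NormalDist().inv_cdf(pct / 100)
    ln_mean = math.log(median)
    ln_std = (math.log(value) - ln_mean) / z
    return ln_mean, ln_std


if __name__ == "__main__":
    # Supplier median estimate of beta (2.1) and expert 90th percentile (2.4)
    print(lognormal_from_median_and_percentile(2.1, 2.4, 90))
```

For these inputs the ln-standard deviation comes out near 0.10, which is of the same order as the 0.115 obtained from the full set of percentiles.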