Not every set is a system. To be a system, a set needs a sense of unity, functional relationships between its components, and/or some useful purpose. For example, a random group of items in a room is not a system unless at least one of these conditions is met.

Components are interrelated and work together toward some purpose, objective, or function. The properties and behavior of each component affect the properties of the system as a whole. For example, memory speed, disk access time, and disk capacity all affect the overall speed of a computer. The properties of each component depend on at least one other component. For example, memory performance depends on bus speed (bandwidth). Each subset (or subsystem) of components is related in the same manner, but the system cannot be divided into independent subsets.

Often, a system has a hierarchy of components. A system is made up of components, and those components are made up of smaller components. The lower hierarchical levels are called subsystems. One example is a hard disk drive. The drive is a component of a computer, but it in turn contains multiple platters, a read/write head, a buffer, and many other smaller components.

Engineering is concerned with the economical use of limited resources to benefit people. Systems engineering accomplishes this by approaching a problem with several things in mind. First, product and system requirements must be defined as they relate to true customer needs. For example, the requirements for an email system must be well defined if the system is to meet a customer’s communication needs. Second, engineering must address the total system, with all of its elements, from a life-cycle perspective. The overall hierarchy must be considered, including the interactions between levels and between elements at the same level. An example in a computer system is the memory hierarchy, composed of a two-level cache, main memory, and virtual memory on a hard disk. Third, it is often necessary to organize related disciplines, such as the separate mechanical and electrical aspects of a system, into one engineering effort in a timely, concurrent manner. Finally, it is vital to establish a disciplined approach to the process (manage the process to get results). This includes appropriate review, evaluation, and feedback to ensure orderly and efficient progress.

An example of this way of thinking is as follows. Dictators in third-world countries often want to ride around in fancy cars. However, the surrounding infrastructure offers little support for this preference. Filling stations are scarce, and the economy may not support many trained mechanics for automotive repairs. So, from an engineering standpoint, making this system viable would require much more design effort and money.

1. Design, development, production/construction, distribution, operation, maintenance & support, retirement, phaseout, disposal
2. Past emphasis on design & acquisition, with little emphasis on production, operation, maintenance, support & disposal
3. Example: If an old computer goes to a landfill (taking up space and polluting the groundwater), a better design would allow the recovery of gold, lead, and other materials upon disposal.

Better definition of system requirements - Trace down customer needs to individual components

Interdisciplinary

1. Systems usually require multiple disciplines
2. Example: In the development of a computer game, a company has 3 employees – an artist, a musician, & a programmer.

The shape of the hazard function indicates how an item ages. It has an intuitive interpretation as the amount of risk an item is subject to at a time t:

Increasing Hazard Function: This is probably the most common situation, because items wear out or degrade with time. Examples include mechanical items that undergo wear or fatigue, such as car tires whose rubber gets thinner over time.

Decreasing Hazard Function: In this situation, an item improves; that is, it is less likely to fail as time passes. For example, some metals “work-harden” through continued use. Also, software may improve as bugs are removed.

Bathtub-Shaped Failure Rate: This situation describes many natural systems and manufactured goods. It is a composite of 3 effects:

*early failures due to defects
*late failures due to wear out
*accidents at a constant rate

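The memoryless property states that the conditional failure distribution of an item that has survived to time s is identical to the failure distribution of a brand-new item. Writing T for the item's lifetime, $P(T > s + t \mid T > s) = P(T > t)$.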

One example of this is a fuse. A fuse fails due to a power surge, but does not weaken or degrade over time. The memoryless property, with its used-as-good-as-new assumption, has limited applicability. Even so, the exponential distribution is easily misapplied because it is so convenient:

*statistical techniques are particularly tractable
*can add failure rates
*field data often allow an estimation of only this one-parameter distribution
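The used-as-good-as-new behaviour of the exponential distribution can be checked numerically. The following is a minimal sketch; the failure rate `lam` and the times `s` and `t` are illustrative values assumed here, not taken from the text.

```python
import math

def exp_survival(t, lam):
    """P(T > t) for an exponential lifetime with constant failure rate lam."""
    return math.exp(-lam * t)

def conditional_survival(s, t, lam):
    """P(T > s + t | T > s): chance of lasting t more hours after surviving s."""
    return exp_survival(s + t, lam) / exp_survival(s, lam)

lam = 0.002          # assumed failure rate per hour (illustrative only)
s, t = 500.0, 100.0  # item has already survived s hours; look t hours ahead

print(f"new item survives {t} h:  {exp_survival(t, lam):.6f}")
print(f"used item survives {t} h: {conditional_survival(s, t, lam):.6f}")
```

Both lines print the same probability, which is exactly why the model is unsuitable for items that wear out.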

Waloddi Weibull, a Swedish physicist, introduced this distribution in 1939. It is a generalization of the exponential distribution, suitable for modeling lifetimes with constant, strictly increasing, or strictly decreasing hazard functions.

Note that the Weibull Distribution can match different phases of the bathtub curve.
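A short sketch of how the Weibull shape parameter selects the hazard behaviour, using the standard two-parameter hazard h(t) = (β/η)(t/η)^(β−1). The parameter names `shape` (β) and `scale` (η) and the numeric values are assumptions made for this illustration.

```python
def weibull_hazard(t, shape, scale):
    """Two-parameter Weibull hazard: (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

scale = 1000.0  # assumed characteristic life in hours (illustrative only)
cases = [(0.5, "decreasing hazard (early failures)"),
         (1.0, "constant hazard (exponential case)"),
         (3.0, "increasing hazard (wear-out)")]

for shape, label in cases:
    rates = [weibull_hazard(t, shape, scale) for t in (100.0, 500.0, 2000.0)]
    print(label, ["%.2e" % r for r in rates])
```

A shape below 1 corresponds to the early part of the bathtub curve, a shape of 1 to the flat middle, and a shape above 1 to the wear-out end.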

Procedure:

1. Collect the failure data.
2. Get the best fit for the data to a Weibull distribution:
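One common way to carry out the fitting step is median-rank regression, sketched below. This is a swapped-in technique chosen for illustration, not necessarily the fitting method intended here; the function name, the use of Bernard's approximation for the median ranks, and the sample failure times are all assumptions.

```python
import math

def fit_weibull_median_rank(failure_times):
    """Estimate Weibull (shape, scale) from complete failure data.

    Uses the linearization ln(-ln(1 - F)) = shape*ln(t) - shape*ln(scale)
    with Bernard's median-rank approximation F_i = (i - 0.3) / (n + 0.4),
    then fits a straight line by ordinary least squares.
    """
    times = sorted(failure_times)
    n = len(times)
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
          for i in range(1, n + 1)]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    shape = sxy / sxx                      # slope of the fitted line
    intercept = mean_y - shape * mean_x    # equals -shape * ln(scale)
    scale = math.exp(-intercept / shape)
    return shape, scale

# Hypothetical failure times in hours (illustrative data only).
shape, scale = fit_weibull_median_rank([150, 340, 560, 800, 1130, 1720])
print(f"shape = {shape:.2f}, scale = {scale:.0f} h")
```

A fitted shape below 1 suggests the item is still in its burn-in phase, a shape near 1 suggests a roughly constant failure rate, and a shape above 1 suggests wear-out.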

If the item is still in the burn-in phase:

*Improve supplier quality
*Burn in the system longer
*Be more careful while manufacturing

At GE, a variation of as little as 1% in a light bulb's filament leads to a 25% shorter lifespan.

The Parts Count reliability model assumes that the system is in series; this model underestimates the reliability of redundant systems. For redundant systems, the Parts Count model is used to estimate the reliability of the series subsystems and interfaces. Reliability is then computed while considering the redundancy structure.
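The series assumption means that part failure rates simply add. The sketch below shows that bookkeeping for a hypothetical parts list; the part types, quantities, and failure rates are invented for illustration, and the quality and environment adjustment factors that handbook-style parts-count methods typically apply are deliberately left out.

```python
import math

# Hypothetical parts list: (part type, quantity, failures per 10^6 hours).
PARTS = [
    ("resistor",   12, 0.002),
    ("capacitor",   8, 0.010),
    ("transistor",  4, 0.050),
    ("connector",   2, 0.100),
]

def series_failure_rate(parts):
    """Total failure rate of a series system: sum of quantity * part rate."""
    return sum(qty * rate for _name, qty, rate in parts)

def series_reliability(parts, hours):
    """R(t) = exp(-lambda * t), assuming constant (exponential) failure rates."""
    lam_per_hour = series_failure_rate(parts) / 1e6
    return math.exp(-lam_per_hour * hours)

print(f"total rate: {series_failure_rate(PARTS):.3f} failures per 10^6 h")
print(f"R(10000 h): {series_reliability(PARTS, 10000):.5f}")
```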

Using our AM Signal Pickup example again:

Ground Mobile environment (GM)

Series Subsystem:

Interface:

System Reliability Estimate:

Simplex System:

Simple Redundant System (ignoring interface problem):

R = .9876

Note: In some cases the interface reliability may dominate the redundant subsystem reliability and determine the overall system reliability. In this case the simplex system may be more reliable than the redundant system.
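The effect described in the note can be seen with a small calculation. This sketch assumes two identical, independent copies of a subsystem in parallel, placed in series with a shared interface, and compares that with a single copy that needs no interface. The reliabilities used are hypothetical and are not the values from the AM Signal Pickup example.

```python
def simplex_reliability(r_sub):
    """A single subsystem with no redundancy interface."""
    return r_sub

def redundant_reliability(r_sub, r_interface):
    """Two parallel copies of the subsystem in series with a shared interface."""
    return r_interface * (1.0 - (1.0 - r_sub) ** 2)

r_sub = 0.95  # hypothetical subsystem reliability
for r_interface in (0.999, 0.94):
    r_red = redundant_reliability(r_sub, r_interface)
    better = "redundant" if r_red > simplex_reliability(r_sub) else "simplex"
    print(f"interface {r_interface}: redundant R = {r_red:.5f}, better: {better}")
```

With a very reliable interface the redundant arrangement wins; once the interface is less reliable than the subsystem it protects, the simplex system comes out ahead.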

The voter compares the outputs of all N modules and outputs the majority value. This is called N-Modular Redundancy (NMR). An NMR system generally has an odd number of modules, so $N = 2n + 1$. The system works if at least $n + 1$ modules are working (it can tolerate up to $n$ failures) and the voter is working.
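The majority condition translates directly into a binomial sum. The sketch below assumes identical, independent modules with reliability `r_module` and a voter with reliability `r_voter`; both names are chosen for this illustration.

```python
from math import comb

def nmr_reliability(n_modules, r_module, r_voter=1.0):
    """Reliability of an NMR system with N = 2n + 1 identical, independent modules.

    The system works if the voter works and at least n + 1 modules work.
    """
    if n_modules % 2 == 0:
        raise ValueError("NMR sketch expects an odd number of modules")
    needed = n_modules // 2 + 1  # n + 1 working modules form a majority
    r_majority = sum(
        comb(n_modules, k) * r_module ** k * (1.0 - r_module) ** (n_modules - k)
        for k in range(needed, n_modules + 1)
    )
    return r_voter * r_majority

print(f"TMR, R = 0.9, perfect voter: {nmr_reliability(3, 0.9):.4f}")
print(f"TMR, R = 0.9, voter 0.99:    {nmr_reliability(3, 0.9, 0.99):.4f}")
print(f"TMR, R = 0.4, perfect voter: {nmr_reliability(3, 0.4):.4f}")
```

The last case shows that voting only helps when the individual modules are more likely to work than not: with a module reliability below 0.5, the TMR result is lower than that of a single module.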

Simple Voter

Analog Signal or Numeric Voting

The voter compares input signals (or numeric values) and picks the middle value as its output. Normal operation is as follows:

However, error conditions may arise:

Note: Reliability calculations assume the worst-case conditions:

*All modules fail in the same logical direction
*There are no compensating failures (e.g., one module becomes stuck at 1 while another is stuck at 0)
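Mid-value selection, and the reason the worst-case assumptions matter, can be sketched as follows. The signal values are hypothetical, and the voter here is simply the median of an odd number of inputs.

```python
def mid_value_vote(signals):
    """Return the middle value of an odd number of input signals."""
    if len(signals) % 2 == 0:
        raise ValueError("mid-value voter sketch expects an odd number of inputs")
    return sorted(signals)[len(signals) // 2]

# Hypothetical readings from three modules that should all report about 5.0.
print(mid_value_vote([5.01, 4.98, 5.00]))  # normal operation
print(mid_value_vote([5.01, 0.00, 5.00]))  # one module stuck low: masked
print(mid_value_vote([5.00, 0.00, 9.90]))  # stuck low and stuck high: compensating
print(mid_value_vote([5.01, 0.00, 0.00]))  # two modules stuck low: voter is wrong
```

Two failures in opposite directions still leave the good value in the middle, while two failures in the same direction defeat the voter. Assuming the latter in reliability calculations is therefore conservative.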

Failed modules accumulate in an NMR system until they become the majority and the system fails. The system life can be extended by purging the failed modules. This can be accomplished through Hybrid Redundancy (using spares) or through Adaptive Voting (also called Change Voting). In either case, the failed module(s) must first be detected.

This system has the following attributes:

*N+S modules (S spares)
*Disagreement Detector compares the voted output with the module outputs
*Switch selects the outputs from N modules to give to the voter
*If a module fails, the Disagreement Detector tells the switch to replace the failed module with a spare one

This configuration is often used with TMR systems. If more than a few spares must be switched, the complexity of the switch increases to the point where the switch's own reliability dominates the system reliability.
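The interaction of voter, disagreement detector, and switch can be sketched in a few lines. The class name `HybridRedundancy`, the representation of modules as plain Python callables, and the example modules are all hypothetical; the detector and switch are assumed to be perfect.

```python
from collections import Counter

class HybridRedundancy:
    """N active modules feeding a majority voter, backed by S spares."""

    def __init__(self, active, spares):
        self.active = list(active)  # modules currently switched to the voter
        self.spares = list(spares)  # unused spare modules

    def step(self, x):
        outputs = [module(x) for module in self.active]
        voted = Counter(outputs).most_common(1)[0][0]  # majority output
        # Disagreement detector: compare each module output with the vote.
        for i, out in enumerate(outputs):
            if out != voted and self.spares:
                # Switch: replace the disagreeing module with a spare.
                self.active[i] = self.spares.pop(0)
        return voted

def good(x):
    return 2 * x

def stuck_at_zero(_x):
    return 0

system = HybridRedundancy(active=[good, stuck_at_zero, good], spares=[good])
print(system.step(3), len(system.spares))  # fault is masked, spare switched in
print(system.step(5), len(system.spares))  # all three active modules now agree
```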

Suppose three programmers each write code to the same specification, and the system votes on the results. In a TMR system, each program could execute on a completely different set of hardware. However, software is labor-intensive and very expensive to produce. N-version programming significantly increases this cost, does not protect against specification errors, and introduces timing and coordination problems because the programs are not identical to one another.

In adaptive voting, the voted output is compared with the module outputs. When a module fails, it is removed along with one other module (this is to keep an odd number of modules). The voter is then changed to select the majority of the remaining modules. This approach can be combined with hybrid redundancy in order to switch good modules back in. Voting (particularly TMR) is used in many fault-tolerant, very-high-reliability computer systems.
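A minimal sketch of one round of adaptive voting follows. The function name, the dictionary of module outputs, and the rule for choosing which healthy module to retire (simply the last one listed) are assumptions made for illustration.

```python
from collections import Counter

def adaptive_vote(outputs, active):
    """Run one voting round and return (voted output, modules kept for next round).

    outputs: dict mapping module id -> that module's output this round
    active:  list of module ids currently included in the vote
    """
    voted = Counter(outputs[m] for m in active).most_common(1)[0][0]
    remaining = [m for m in active if outputs[m] == voted]  # drop failed modules
    # Retire one healthy module if needed so the voter keeps an odd count.
    if len(remaining) > 1 and len(remaining) % 2 == 0:
        remaining.pop()
    return voted, remaining

active = ["a", "b", "c", "d", "e"]
voted, active = adaptive_vote({"a": 1, "b": 1, "c": 0, "d": 1, "e": 1}, active)
print(voted, active)  # module c failed; one healthy module is retired with it
voted, active = adaptive_vote({"a": 1, "b": 0, "d": 1}, active)
print(voted, active)  # module b failed; the system degrades to a single module
```

Retired healthy modules are natural candidates for the spare pool if this scheme is combined with hybrid redundancy.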

In general, A and B can be different (e.g., A can be an on-line power source while B is a generator). Note that B can fail while in its standby mode, and the switch can also fail. Examine the following simple case. Assume:

Reliability is as follows:

Recall that the above is the law of total probability.

Therefore,
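As an illustration of a standby calculation built on the law of total probability, the sketch below uses its own simplifying assumptions, which may differ from the case above: a perfect switch, a standby unit B that cannot fail while idle, and the same constant failure rate `lam` for both units. Conditioning on the time x at which A fails gives R(t) = R_A(t) + ∫₀ᵗ f_A(x) R_B(t − x) dx, which under these assumptions reduces to e^(−λt)(1 + λt).

```python
import math

def standby_reliability_closed_form(lam, t):
    """exp(-lam*t) * (1 + lam*t): perfect switch, identical exponential units."""
    return math.exp(-lam * t) * (1.0 + lam * t)

def standby_reliability_numeric(lam, t, steps=10000):
    """Evaluate R_A(t) + integral of f_A(x) * R_B(t - x) dx with the midpoint rule."""
    survive_on_a = math.exp(-lam * t)  # A never fails during [0, t]
    dx = t / steps
    takeover = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx                       # A fails at time x ...
        f_a = lam * math.exp(-lam * x)           # ... with this density,
        r_b = math.exp(-lam * (t - x))           # and B survives the remainder
        takeover += f_a * r_b * dx
    return survive_on_a + takeover

lam, t = 0.001, 1000.0  # hypothetical failure rate (per hour) and mission time
print(f"single unit:        {math.exp(-lam * t):.5f}")
print(f"standby (closed):   {standby_reliability_closed_form(lam, t):.5f}")
print(f"standby (numeric):  {standby_reliability_numeric(lam, t):.5f}")
```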

A sequence of failures forms a process that starts over each time a device fails and a new one is switched in. This is called a renewal process. The time between failures is exponentially distributed, where X is a random variable denoting the time between failures. Suppose we have n systems as follows:

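Recall that for a Poisson process with rate $\lambda$:

i) Events in non-overlapping intervals are independent.
ii) $P(\text{event in a small interval } h) \approx \lambda h$ and $P(\text{no event in } h) \approx 1 - \lambda h$.
iii) The time between events, $X$, is exponentially distributed: $f_X(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.
iv) The number of events in an interval $T$, $n(T)$, has the Poisson distribution $P(n(T) = k) = \frac{(\lambda T)^k e^{-\lambda T}}{k!}$.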

Also recall (for iii) that

Furthermore,

As you might have deduced, this sequence of failures is a Poisson process. Therefore,
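To illustrate the Poisson-process view of failures, the sketch below computes the probability of exactly k failures in an interval T, and the probability that one operating unit plus a pool of spares lasts the whole interval. It assumes perfect switching and spares that do not fail while idle; the function names, failure rate, and mission time are hypothetical.

```python
import math

def poisson_pmf(k, lam, T):
    """P(exactly k failures in an interval of length T) at failure rate lam."""
    mu = lam * T
    return mu ** k * math.exp(-mu) / math.factorial(k)

def prob_spares_suffice(spares, lam, T):
    """P(at most `spares` failures in T): the spare pool is not exhausted."""
    return sum(poisson_pmf(k, lam, T) for k in range(spares + 1))

lam, T = 0.001, 1000.0  # hypothetical failure rate (per hour) and mission time
for spares in range(4):
    print(f"{spares} spare(s): {prob_spares_suffice(spares, lam, T):.5f}")
```

With no spares this reduces to the single-unit reliability exp(−λT), and each additional spare adds one more term of the Poisson sum.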