Design diversity techniques are specifically developed to tolerate design faults in
software arising out of wrong specifications and incorrect coding. Two or more variants
of the software, developed by different teams to a common specification, are used.
These variants are then deployed in a time- or space-redundant manner to achieve fault tolerance.
Popular techniques which are based on the design diversity concept for fault tolerance in
software are:

N-version programming: First introduced by Avizienis et al. [2] in 1977, this concept
is similar to the NMR (N-modular redundancy) approach in hardware fault tolerance. In this
technique, N (N>=2) independently generated, functionally equivalent programs, called versions,
are executed in parallel. A majority voting logic is used to compare the results produced by all the
versions and report one of the results which is presumed correct. The ability to
tolerate faults here depends on how ``independent'' the different versions of the program are.
This technique has been applied
to a number of real-life systems like railroad traffic control and flight control, even though
the overhead involved in generating different versions and implementing the voting logic may be
high.
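
The voting step described above can be sketched in a few lines of Python. This is only a
minimal illustration under stated assumptions, not the scheme of [2]: the function
n_version_execute, the three toy "versions" of a squaring routine, and the policy of
raising an error when no strict majority exists are all hypothetical choices made for
the example. Results are assumed to be hashable so that they can be counted.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def n_version_execute(versions, x):
    """Run all versions on the same input and return the strict-majority result."""
    # Execute the versions concurrently; leaving the `with` block waits for all of them.
    with ThreadPoolExecutor(max_workers=len(versions)) as pool:
        futures = [pool.submit(version, x) for version in versions]

    results = []
    for future in futures:
        try:
            results.append(future.result())
        except Exception:
            # Assumption: a version that crashes simply contributes no vote.
            pass

    tally = Counter(results).most_common(1)
    # Require agreement among more than half of ALL versions, not just the survivors.
    if not tally or tally[0][1] * 2 <= len(versions):
        raise RuntimeError("no majority among versions")
    return tally[0][0]


# Three hypothetical, independently written versions of "square x".
def square_by_multiply(x):
    return x * x


def square_by_power(x):
    return x ** 2


def square_faulty(x):
    return x + x  # design fault: only correct for x == 0 and x == 2


versions = [square_by_multiply, square_by_power, square_faulty]
result = n_version_execute(versions, 3)  # the two correct versions outvote the faulty one
```

The sketch also shows why independence matters: if two of the three versions shared the
same design fault, the voter would report the wrong value as the majority result.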

Recovery block: Recovery blocks were first introduced by Horning et al. [14]. This scheme
is analogous to the cold standby scheme for hardware fault tolerance. In this approach,
multiple functionally equivalent variants of the software are deployed
in a time-redundant fashion. An acceptance test is used to test the validity of the
result produced by the primary version. If the result from the primary version passes the
acceptance test, this result is reported and execution stops. If, on the other hand, the
result from the primary version fails the acceptance test, another version from among the
multiple versions is invoked and the result produced is checked by the acceptance test. The
execution of the structure does not stop until the acceptance test is passed by one of
the multiple versions or until all the versions have been exhausted. The significant
differences between the recovery block approach and N-version programming are that only one version is
executed at a time and the acceptability of results is decided by a test rather than by
majority voting. The recovery block technique has been applied to real-life systems and has been
the basis for the distributed recovery block structure for integrating hardware and software
fault tolerance and the extended distributed recovery block structure for command and control
applications. Modeling and analysis of recovery blocks are described by Tomek et al. [28,29].
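
The control flow of a recovery block, as described above, can be sketched as follows. This
is a simplified illustration rather than the construct of [14]: the names recovery_block,
faulty_sort, backup_sort and sorted_acceptance_test are hypothetical, a variant that
raises an exception is assumed to count as a failed attempt, and the restoration of
program state before an alternate is tried is left out because the toy variants do not
modify their input.

```python
def recovery_block(variants, acceptance_test, data):
    """Try the variants one at a time; return the first result that passes the test."""
    for variant in variants:  # variants[0] is the primary, the rest are alternates
        try:
            result = variant(data)
        except Exception:
            # Assumption: a crash is treated like a failed acceptance test.
            continue
        if acceptance_test(data, result):
            return result
    raise RuntimeError("all variants exhausted without passing the acceptance test")


def sorted_acceptance_test(data, result):
    """Accept a result that has the same length as the input and is in ascending order."""
    same_length = len(result) == len(data)
    ordered = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    return same_length and ordered


def faulty_sort(data):
    return list(data)  # design fault: returns the input unsorted


def backup_sort(data):
    return sorted(data)


# The primary fails the acceptance test, so the alternate is invoked.
result = recovery_block([faulty_sort, backup_sort], sorted_acceptance_test, [3, 1, 2])
```

Unlike the voter in N-version programming, the acceptance test judges a single result on
its own, so its coverage directly bounds the faults the structure can detect. The test
above, for instance, would accept an ordered list that contains the wrong elements.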

N-self checking programming: In N-self checking programming, multiple variants
of the software are used in a hot-standby fashion, as opposed to the recovery block technique,
in which the variants are used in the cold-standby mode. A self-checking software component
is a variant with an acceptance test or a pair of variants with an associated comparison test
[19]. Fault tolerance is achieved by executing more than one self-checking component
in parallel. These components can also be used to tolerate one or more hardware faults.

The design diversity approach was developed mainly to deal with Bohrbugs. It relies on the
assumption of independence between the multiple variants of the software. However, as some studies
have shown, this assumption may not always be valid. Design diversity can also be used to treat
Heisenbugs. Since there are multiple versions of the software operating, it is not likely that
all of them will experience the same transient failure. One of the disadvantages of design
diversity is the high cost involved in developing multiple variants of software. As we
shall see in Section 3.3, there are other approaches which are
more efficient and better suited to deal with Heisenbugs.