Authors

Document Type

Article

Department

Mathematics (HMC)

Publication Date

1988

Abstract

Achieving processor cooperation in the presence of faults is a major problem in distributed systems. Popular paradigms such as Byzantine agreement have been studied principally in the context of a complete network. Indeed, Dolev [J. Algorithms, 3 (1982), pp. 14–30] and Hadzilacos [Issues of Fault Tolerance in Concurrent Computations, Ph.D. thesis, Harvard University, Cambridge, MA, 1984] have shown that Ω(t) connectivity is necessary if the requirement is that all nonfaulty processors decide unanimously, where t is the number of faults to be tolerated. We believe that in forseeable technologies the number of faults will grow with the size of the network while the degree will remain practically fixed. We therefore raise the question whether it is possible to avoid the connectivity requirements by slightly lowering our expectations.

In many practical situations we may be willing to “lose” some correct processors and settle for cooperation between the vast majority of the processors. Thus motivated, we present a general simulation technique by which vertices (processors) in almost any network of bounded degree can simulate an algorithm designed for the complete network. The simulation has the property that although some correct processors may be cut off from the majority of the network by faulty processors, the vast majority of the correct processors will be able to communicate among themselves undisturbed by the (arbitrary) behavior of the faulty nodes.

We define a new paradigm for distributed computing, almost-everywhere agreement, in which we require only that almost all correct processors reach consensus. Unlike the traditional Byzantine agreement problem, almost-everywhere agreement can be solved on networks of bounded degree. Specifically, we can simulate any sufficiently resilient Byzantine agreement algorithm on a network of bounded degree using our communication scheme described above. Although we “lose” some correct processors, effectively treating them as faulty, the vast majority of correct processors decide on a common value.