Processors execute a program's full dynamic instruction stream to arrive at its final output, yet there exist shorter instruction streams that produce the same overall effect. This thesis proposes creating a shorter but otherwise equivalent version of the original program by removing ineffectual computation and computation related to highly predictable control flow. The shortened program is run concurrently with and slightly ahead of a full copy of the program on a chip multiprocessor (CMP) or simultaneous multithreading (SMT) processor. The leading program passes all of its control-flow and data-flow outcomes to the trailing program for checking. This redundant program arrangement provides two key benefits.
1) Improved single-program performance. The leading program is sped up because it retires fewer instructions. Although the number of retired instructions is not reduced in the trailing program, it fetches and executes instructions more efficiently by virtue of having near-oracle branch and value predictions from the leading program. Thus, the trailing program is also sped up in the wake or 'slipstream' of the leading program, at the same time validating the speculative leading program and redirecting it as needed. Slipstream execution using two processors of a CMP substrate outperforms conventional non-redundant execution using only one of the processors. Likewise, given a sufficiently reduced leading program, slipstream execution using two contexts of an SMT substrate outperforms conventional non-redundant execution using only one of the contexts.
2) Fault tolerance. The shorter program is a subset of the full program, and this partial redundancy is exploited for detecting and recovering from transient hardware faults. No additional hardware support is required, since the same mechanisms used to detect and recover from misspeculation in the leading program apply equally well to transient fault detection and recovery. In fact, there is no way to distinguish between misspeculation and faults.
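The checking arrangement described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the mechanism evaluated in the thesis: the instruction format, the register names, the `removed` set, and the `fault_at` parameter are all hypothetical, the two streams advance in lockstep rather than with the leading stream running ahead, and only a fault in the leading stream is modeled. The point it shows is that a mismatch takes the same recovery path whether it stems from an unsafe removal (misspeculation) or from a corrupted value (transient fault).

```python
from collections import deque


def execute(instr, regs):
    """Apply one toy instruction (dest, src, imm): regs[dest] = regs[src] + imm."""
    dest, src, imm = instr
    regs[dest] = regs[src] + imm
    return regs[dest]


def slipstream(program, removed, fault_at=None):
    """Run a shortened leading stream and a full trailing stream over `program`.

    removed  -- indices the leading stream skips (hypothetical removal choice)
    fault_at -- index at which the leading stream's result is corrupted, to
                imitate a transient fault (hypothetical injection point)
    Returns the trailing stream's final registers and the number of recoveries.
    """
    lead = {"r0": 0, "r1": 0}
    trail = dict(lead)
    delay_buffer = deque()  # outcomes passed from the leading to the trailing stream
    recoveries = 0

    for i, instr in enumerate(program):
        # Leading stream: executes only the instructions that were not removed.
        if i not in removed:
            value = execute(instr, lead)
            if i == fault_at:
                value ^= 1  # flip one bit of the result
                lead[instr[0]] = value
            delay_buffer.append((i, value))

        # Trailing stream: executes every instruction and checks passed outcomes.
        actual = execute(instr, trail)
        if delay_buffer and delay_buffer[0][0] == i:
            _, predicted = delay_buffer.popleft()
            if predicted != actual:
                # Same path for misspeculation and faults: restore the leading
                # stream's state from the trailing stream and discard outcomes.
                recoveries += 1
                lead = dict(trail)
                delay_buffer.clear()

    return trail, recoveries


# Instruction 2 adds zero, so it is ineffectual and safe to remove.
program = [("r0", "r0", 1), ("r1", "r0", 2), ("r0", "r0", 0), ("r1", "r1", 3)]

safe = slipstream(program, removed={2})
unsafe = slipstream(program, removed={0})             # removes an effectual instruction
faulty = slipstream(program, removed={2}, fault_at=1)  # corrupts a leading-stream result
```

In this sketch the safe removal needs no recovery, while the unsafe removal and the injected fault each trigger exactly one, and all three runs end with the same trailing-stream state. Instructions skipped by the leading stream are never compared, which mirrors the partial nature of the redundancy.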
The broader rationale for slipstream is extending, not replacing, the capabilities of CMP/SMT processors, providing additional modes of execution. This thesis demonstrates the feasibility and benefits of the slipstream execution model.