<p>Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985). Comparison with previous techniques shows that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures. The gain in speed arises from the faster clock that results from simpler combinational logic.</p>