The iterative exchange of extrinsic information between the K-best sphere detector (SD) and the channel decoder is appealing, since it is capable of achieving near maximum a posteriori (MAP) performance at a moderate complexity. However, the computational complexity imposed by the K-best SD increases significantly when a large value of K is used to maintain near-MAP performance in a high-throughput uplink spatial-division multiple access (SDMA) orthogonal frequency-division multiplexing (OFDM) system supporting a large number of users and/or a high number of bits per symbol. This problem is further aggravated when the number of users/mobile stations (MSs) U exceeds the number of receive antennas N at the base station (BS), namely, in the challenging scenario of rank-deficient systems. We demonstrate that the iterative decoding convergence of this two-stage system may be improved by incorporating a unity rate code (URC) having an infinite impulse response, which improves the efficiency of the extrinsic information exchange. Although this results in a slightly more complex three-stage system architecture, it allows us to use a low-complexity SD having a significantly reduced candidate list size N_cand. Alternatively, for a given list size, the target performance is reached at a reduced signal-to-noise ratio (SNR). For example, given a target bit error ratio (BER) of 10⁻⁵ and N_cand = 32 for the SD, the three-stage receiver is capable of achieving a performance gain of 2.5 dB over its two-stage counterpart in a rank-deficient SDMA/OFDM 4-ary quadrature amplitude modulation (4-QAM) system supporting U = 8 cochannel users and employing N = 4 receive antennas at the BS, namely, in an (8 × 4) system. To further enhance the three-stage concatenated receiver, the proposed iterative center-shifting SD scheme and irregular convolutional codes (IrCCs) are amalgamated with it, which leads to an additional performance gain of 2 dB.
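To illustrate what "unity rate" and "infinite impulse response" mean for the URC mentioned above, the following minimal Python sketch implements the simplest recursive rate-1 precoder, the accumulator with generator 1/(1+D). The choice of this particular generator and the function names `urc_encode` and `urc_decode` are assumptions made for illustration only; the abstract does not state which URC the receiver actually employs, and the sketch uses hard bits rather than the soft extrinsic information exchanged in the iterative receiver.

```python
def urc_encode(bits):
    """Rate-1 recursive precoder, assumed generator 1/(1+D).

    Each output bit is the running XOR of all input bits so far:
    y[k] = x[k] XOR y[k-1], with y[-1] = 0.
    """
    state = 0
    coded = []
    for b in bits:
        state ^= b          # feedback: the previous output re-enters the sum
        coded.append(state)
    return coded


def urc_decode(coded):
    """Hard-decision inverse of urc_encode: x[k] = y[k] XOR y[k-1]."""
    prev = 0
    bits = []
    for c in coded:
        bits.append(c ^ prev)
        prev = c
    return bits


# One output bit per input bit (unity rate), yet a single input 1
# influences every later output bit (infinite impulse response).
impulse = [1, 0, 0, 0, 0, 0]
response = urc_encode(impulse)
print(response)
```

Because the feedback spreads each input bit over all subsequent output bits, a recursive code of this kind adds no redundancy while still coupling the detector's and decoder's soft outputs across the whole frame, which is the property the three-stage architecture relies on.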