This paper presents a high throughput, low latency soft-output signal detector for a 4×4 64-QAM MIMO system. To achieve high data-level parallelism and accurate soft information, the detector adopts a node perturbation technique to generate a list of candidate vectors around Zero Forcing, ZF, result. Additionally a fast and hardware friendly node enumeration scheme is developed to significantly reduce processing delay. Implemented using a 65nm CMOS technology, the detector occupies 0.58mm2 core area with 290K gates. The peak throughput is 3Gb/s at 500 MHz clock frequency with a latency of 20ns. Energy consumption per detected bit is 33pJ.