Abstract: The impending power wall for general-purpose processors is limiting performance and pushing system designers toward alternative architectures. ASICs, GPUs, and FPGAs provide higher performance and better energy efficiency, but all carry high design costs, and GPUs and FPGAs are limited to applications that fit their programming models. Many applications -- including image processing, real-time simulations, and numerical optimization -- require high performance but can also tolerate approximation in their computations due to relaxed accuracy constraints. In this paper we present EMEURO, a deep-learning-based emulation and acceleration platform that uses an augmented GPU architecture for efficient neural network (NN) computation. By restructuring algorithms to have the same data flow as a NN, EMEURO achieves significant speedup across several domains with minimal design cost and improved energy efficiency. This paper introduces novel NN compilation techniques, including methods for subroutine modeling, dynamic performance tuning, and fast online retraining on unfamiliar data. We show that EMEURO achieves up to 1,600X speedup over the original algorithm with under 2% approximation error.
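The core idea behind subroutine modeling (training a small NN to mimic a hot, approximable function, then substituting the NN for the original code) can be sketched in a few lines. This is a toy NumPy illustration under stated assumptions: the target function, network size, and training loop are hypothetical stand-ins, not EMEURO's actual compilation pipeline.

```python
import numpy as np

def target_subroutine(x):
    # Hypothetical "hot" subroutine to emulate: a smooth, approximable kernel.
    return np.sin(x) * np.exp(-0.1 * x ** 2)

def predict(X, W1, b1, W2, b2):
    # Forward pass of a tiny 1-16-1 tanh MLP (the NN emulator).
    return np.tanh(X @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))   # sampled subroutine inputs
Y = target_subroutine(X)                      # recorded subroutine outputs

W1 = rng.normal(0.0, 1.0, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 1.0, (16, 1)); b2 = np.zeros(1)

initial_err = np.mean(np.abs(predict(X, W1, b1, W2, b2) - Y))

# Plain full-batch gradient descent on half mean-squared error.
lr = 0.05
for step in range(10000):
    H = np.tanh(X @ W1 + b1)
    err = (H @ W2 + b2) - Y
    dW2 = H.T @ err / len(X); db2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1.0 - H ** 2)        # backprop through tanh
    dW1 = X.T @ dH / len(X); db1 = dH.mean(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

mean_abs_err = np.mean(np.abs(predict(X, W1, b1, W2, b2) - Y))
print(f"approximation error: {initial_err:.3f} -> {mean_abs_err:.3f}")
```

Once trained, the MLP's forward pass is two dense matrix operations, exactly the data-flow shape that maps efficiently onto NN hardware, which is what makes the restructuring worthwhile even when the original subroutine would not vectorize well.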

Bio: Lawrence is a PhD candidate working with Prof. Kunle Olukotun in the Pervasive Parallelism Lab. He has worked on several projects related to deep learning and acceleration, including FPGA implementations, sparse network compilers, and, most recently, acceleration of general-purpose programs. Neural acceleration is an exciting new area and a promising direction for reducing energy consumption while easing the burden of application optimization.