Parameter Encoding on FPGAs Boosts Neural Network Efficiency

July 10, 2017 Nicole Hemsoth The key to creating more efficient neural network models is rooted in trimming and refining the many parameters in deep learning models without losing accuracy. Much of this work is happening on the software side, but devices like FPGAs that can be tuned for trimmed parameters are offering promising early results for implementation. A team from UC San Diego has created a reconfigurable clustering approach to deep neural networks that encodes the parameters the network according the accuracy requirements and limitations of the platform—which are often bound by memory access bandwidth. Encoding the trimmed parameters in an FPGA resulted in a 9X reduction in memory footprint and a 15X boost in throughput while maintaining accuracy. “Neural networks are inherently approximate models and can often be simplified. There is a significant redundancy among neural network parameters, offering alternative models that can deliver the same accuracy with less computation or storage requirement.” Much of the work on FPGAs for deep learning has focused on the implementation of existing models rather than modifying the models to tailor them for FPGA platforms. This work focuses on some of the key challenges of using FPGAs for deep learning, including the…