Problem

Convolutional neural networks (ConvNets) have achieved impressive performance in a wide range of cloud-centric applications, including image classification, speech recognition, and text analysis. To reduce latency and the high energy cost of communication with the cloud, our recent work focuses on the development of highly energy-efficient, edge computing ConvNet ASICs [1-3], with the additional benefit that local computation provides better privacy guarantees per user device [4]. Unfortunately, the performance of ConvNets is often directly linked to the massive number of parameters needed to encode the network and to the availability of representative datasets for training. Deployment of ConvNets in resource-constrained Internet of Everything (IoE) systems remains a challenge [5] due to the high memory energy consumption caused by network storage requirements and substantial data movement.

Motivation

Interest has recently grown in approximate computing, which trades algorithm performance against hardware energy consumption by reducing precision. ConvNets are one class of algorithms that has been shown to be inherently error resilient, motivating extensive studies of the effect of noise on ConvNets, with the goal of decreasing compute energy through approximate computing [6]. Similarly, we can leverage this error resilience by accepting bit errors at reduced supply voltages in exchange for memory energy savings (approximate memory), but few implementations do so because of the limited understanding of how bit errors affect the classification performance of ConvNets.

Approach and Contributions

Motivated by the need to reduce memory energy consumption in hardware ConvNets, and by the current lack of understanding of ConvNet tolerance to bit errors, we present the first silicon-validated study of the efficacy of SRAM voltage scaling for ConvNets, evaluated on the MNIST and CIFAR-10 datasets [7, 8]. Using a hardware-software co-design approach, we demonstrate that the SRAM supply voltage for MNIST ConvNets can be scaled well below Vmin. Furthermore, re-training to account for the resulting SRAM bit errors yields additional improvements in classification accuracy and energy savings [7]. We also show that training with a uniform bit error model is sufficient to achieve classification accuracies very close to those obtained by training with the physical SRAM in the loop [7].
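As an illustration of what a uniform bit error model can look like in software, the following minimal sketch flips every stored bit independently with the same probability. It is not the implementation used in [7]; the function name inject_uniform_bit_errors, the 8-bit word width, and the use of NumPy are assumptions made only for this example.

```python
import numpy as np


def inject_uniform_bit_errors(words, bit_error_rate, rng=None):
    """Flip each bit of 8-bit words independently with probability bit_error_rate.

    Hypothetical helper for illustration: models a memory in which every
    bit cell fails with the same probability, regardless of its position
    or stored value.
    """
    rng = np.random.default_rng() if rng is None else rng
    words = np.asarray(words, dtype=np.uint8)

    # One independent Bernoulli draw per bit of every word.
    flips = rng.random(words.shape + (8,)) < bit_error_rate

    # Collapse the 8 per-bit flags of each word into an XOR mask.
    bit_values = (1 << np.arange(8)).astype(np.uint8)
    mask = (flips * bit_values).sum(axis=-1).astype(np.uint8)

    # XOR toggles exactly the bits selected by the mask.
    return words ^ mask
```

In a re-training loop, a function of this kind would be applied to the quantized weights (or activations) before each forward pass, so that the network learns under the same error statistics it is expected to see at the reduced supply voltage. The bit error rate itself would come from measurements of the target memory and is not specified here.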

Using this framework, we extend these methods to a multi-layer binarized ConvNet performing a more complex image classification task (CIFAR-10), demonstrating that significant errors can accumulate in the network with little to no degradation in classification accuracy [8]. Furthermore, we show that additional energy savings are possible by leveraging the different bit error tolerances of weights and activations, and of the different layers of the network [8]. Finally, we compare the required bit error tolerances of our MNIST and CIFAR-10 implementations, demonstrating that the CIFAR-10 network is less error resilient but still tolerates bit error rates significantly higher than those of conventional memory applications [8]. Our findings and proposed methods serve as a framework that can be applied to the design of custom memory (e.g., hybrid 8T/6T, larger bitcells) and emerging memory technologies (e.g., RRAM, PCM) for ConvNet applications [9].