Deep learning using convolutional neural networks (CNNs) is quickly becoming
the state of the art for challenging computer vision applications. However,
deep learning's power consumption and bandwidth requirements currently limit
its application in embedded and mobile systems with tight energy budgets. In
this paper, we explore the energy savings of optically computing the first
layer of CNNs. To do so, we utilize bio-inspired Angle Sensitive Pixels (ASPs),
custom CMOS diffractive image sensors that act similarly to the Gabor filter banks
in the V1 layer of the human visual cortex. ASPs replace both image sensing and
the first layer of a conventional CNN by directly performing optical edge
filtering, saving sensing energy, data bandwidth, and CNN FLOPs. Our
experimental results (both on synthetic data and a hardware prototype) for a
variety of vision tasks such as digit recognition, object recognition, and face
identification demonstrate a 97% reduction in image sensor power consumption and
a 90% reduction in data bandwidth from sensor to CPU, while achieving performance
comparable to traditional deep learning pipelines.
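To make the idea concrete, the following is a minimal sketch, in PyTorch, of how an ASP-style first layer can be simulated in software: a fixed, untrained bank of oriented Gabor filters stands in for the optical edge filtering, and only the downstream layers are learned. The filter parameters (kernel size, orientations, sigma, wavelength) and the small classifier head are illustrative assumptions, not the paper's actual configuration.

    import numpy as np
    import torch
    import torch.nn as nn

    def gabor_kernel(size, theta, sigma=2.0, lam=4.0):
        # Real-valued Gabor kernel at orientation theta (radians).
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)
        g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)
        return g - g.mean()  # zero mean, so the filter responds to edges, not brightness

    # Fixed "optical" first layer: 8 oriented Gabor filters that are never trained,
    # standing in for the edge filtering the ASP optics perform before readout.
    n_filters, ksize = 8, 7
    thetas = np.linspace(0, np.pi, n_filters, endpoint=False)
    weights = np.stack([gabor_kernel(ksize, t) for t in thetas])  # (8, 7, 7)

    asp_layer = nn.Conv2d(1, n_filters, kernel_size=ksize, padding=ksize // 2, bias=False)
    asp_layer.weight.data = torch.tensor(weights[:, None, :, :], dtype=torch.float32)
    asp_layer.weight.requires_grad = False  # frozen: this layer models the optics

    # Only the layers after the "optical" front end are learned.
    model = nn.Sequential(asp_layer, nn.ReLU(), nn.Flatten(), nn.LazyLinear(10))

    logits = model(torch.randn(1, 1, 28, 28))  # e.g., a 28x28 digit image; shape (1, 10)

On the real hardware, the first layer's responses are computed in the optical domain, so the convolution above and the full-resolution image readout it consumes are exactly what the sensor saves.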