AI chip said to outperform GPUs

SAN JOSE, Calif. — A startup with ties to Amazon is sampling a 16-nm chip mainly targeted for data centers that it claims handily beats CPUs and GPUs for deep-learning inference jobs. Habana is raising funds to support its production and a roadmap that includes a 16-nm training chip sampling next year as well as follow-on 7-nm products.

The startup is the latest to join a frothy AI sector of as many as 50 companies with some form of machine-learning accelerator. To date, big data centers driving the technology typically run their workloads on the large banks of CPUs and GPUs that they maintain.

The startup’s founders worked together at PrimeSense, which spawned depth-sensing technology that made its way into Microsoft’s Kinect and Apple’s iPhone X. Over their careers, the team members have worked on a total of 20 DSPs.

At the heart of Habana’s Goya inference chip are eight VLIW cores with a homegrown instruction set geared for deep learning and programmable in C. The startup claims that it has a library of 400 kernels that it and subcontractors created for inference tasks across all neural-network types. The chip supports a range of 8- to 32-bit floating-point and integer formats.

The Goya chip can process 15,000 ResNet-50 images/second with 1.3-ms latency at a batch size of 10 while running at 100 W. That compares to 2,657 images/second for an Nvidia V100 and 1,225 for a dual-socket Xeon 8180. At a batch size of one, Goya handles 8,500 ResNet-50 images/second with a 0.27-ms latency.
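To put the quoted figures side by side, the short sketch below works out the ratios they imply. It uses only the numbers Habana supplied above; the variable names are illustrative, and because the article gives no power or batch-size figures for the V100 or Xeon results, the per-watt value is computed for Goya alone.

```python
# Vendor-claimed ResNet-50 inference figures quoted in the article.
GOYA_IPS_BATCH10 = 15_000   # images/second at batch size 10
GOYA_IPS_BATCH1 = 8_500     # images/second at batch size 1
GOYA_WATTS = 100            # power while running the batch-10 workload
V100_IPS = 2_657            # Nvidia V100, images/second
XEON_IPS = 1_225            # dual-socket Xeon 8180, images/second

# Throughput ratios implied by the claims (not independent measurements).
goya_vs_v100 = GOYA_IPS_BATCH10 / V100_IPS      # roughly 5.6x
goya_vs_xeon = GOYA_IPS_BATCH10 / XEON_IPS      # roughly 12.2x

# Efficiency for Goya only; comparable power data for the others is not given.
goya_ips_per_watt = GOYA_IPS_BATCH10 / GOYA_WATTS

# Share of batch-10 throughput that Goya keeps at batch size 1.
batch1_fraction = GOYA_IPS_BATCH1 / GOYA_IPS_BATCH10

print(f"Goya vs. V100: {goya_vs_v100:.2f}x")
print(f"Goya vs. Xeon: {goya_vs_xeon:.2f}x")
print(f"Goya efficiency: {goya_ips_per_watt:.0f} images/second/W")
print(f"Batch-1 throughput vs. batch-10: {batch1_fraction:.0%}")
```

By these numbers, the claimed advantage is between five and six times the V100 and about 12 times the dual-socket Xeon, with Goya retaining a little over half of its batch-10 throughput at a batch size of one.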

The startup is keeping details of its architecture under NDA, so the blocks in this diagram are not drawn to scale. (Image: Habana)