Nvidia has just announced two new GPUs for deep learning, the Tesla P4 and the Tesla P40. They succeed the Tesla M4 and M40, respectively, and are more powerful than their predecessors. The Pascal-based Tesla P100 introduced support for 16-bit floating-point (FP16) precision, while the Tesla P4 and P40 add support for 8-bit integer (INT8) precision, because researchers have found that deep learning inference does not need especially high precision. Of the two new GPUs, the Tesla P4 is the low-end model, aimed at scale-out servers that need power-efficient GPUs. It consumes 50-75W and has a peak performance of 5.5 TeraFLOP/s in single precision and 21.8 INT8 TOP/s. In the AlexNet image processing test, the Tesla P4 is 40 times more energy-efficient than an Intel Xeon E5 CPU. The Tesla P40's performance gains also come from the Pascal architecture…
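
To illustrate why 8-bit integers can be enough for neural-network arithmetic, here is a minimal Python sketch of symmetric INT8 quantization applied to a dot product. The function names and sample values are hypothetical and chosen only for illustration; this is not Nvidia's implementation, and real hardware performs the integer multiply-accumulate in dedicated instructions rather than in Python loops.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale factor for the whole vector; fall back to 1.0 for all zeros.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(a, b):
    """Approximate a floating-point dot product using 8-bit integer operands."""
    qa, scale_a = quantize_int8(a)
    qb, scale_b = quantize_int8(b)
    # Multiply and accumulate in integers, then rescale once at the end.
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b


# Hypothetical sample data standing in for one neuron's weights and inputs.
weights = [0.42, -1.30, 0.07, 0.95]
activations = [1.10, 0.25, -0.60, 0.80]

exact = sum(w * x for w, x in zip(weights, activations))
approx = int8_dot(weights, activations)
print(f"floating point: {exact:.4f}  INT8 approximation: {approx:.4f}")
```

The two results differ only by a small rounding error, which is the kind of loss a trained network typically tolerates in exchange for higher throughput per watt.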