Title: Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search

Abstract: Achieving a good speed/accuracy trade-off on the target platform is
very important when deploying deep neural networks. Most existing automatic
architecture search approaches pursue only high performance and ignore this
important factor. In this work, we propose an algorithm, "Partial Order
Pruning", that prunes the architecture search space under a partial order
assumption, quickly lifts the boundary of the speed/accuracy trade-off on the
target platform, and automatically searches for the architecture with the best
speed/accuracy trade-off. Our algorithm explicitly takes profile information
about inference speed on the target platform into consideration. With the
proposed algorithm, we present several "Dongfeng" networks that provide high
accuracy and fast inference speed on various GPU platforms. By further
searching the decoder architecture, our DF-Seg real-time segmentation models
yield state-of-the-art speed/accuracy trade-offs on both an embedded device and
a high-end GPU.
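
To make the pruning idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical encoding in which an architecture is a tuple of per-stage sizes, that architecture a precedes b when every entry of a is no larger than the matching entry of b, and that a preceding architecture is both no slower and no more accurate than its successor. The names `precedes`, `latency`, and `prune` are illustrative, and `latency` is a stand-in for a measured profile on the target platform.

```python
from typing import List, Tuple

Arch = Tuple[int, ...]  # hypothetical encoding: one size per stage


def precedes(a: Arch, b: Arch) -> bool:
    """Assumed partial order: a precedes b if a is element-wise no larger."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


def latency(arch: Arch) -> float:
    """Stand-in for a profiled latency lookup (assumption: sum of sizes)."""
    return float(sum(arch))


def prune(
    candidates: List[Arch],
    trained: Arch,
    trained_acc: float,
    boundary: List[Tuple[float, float]],
) -> List[Arch]:
    """Drop candidates that cannot improve the speed/accuracy boundary.

    Under the assumed partial order, any candidate that precedes `trained`
    has accuracy at most `trained_acc`. If a boundary point (latency,
    accuracy) is already at least as fast as the candidate and at least as
    accurate as `trained_acc`, the candidate is dominated and is skipped.
    """
    kept = []
    for cand in candidates:
        if precedes(cand, trained):
            cand_lat = latency(cand)
            dominated = any(
                b_lat <= cand_lat and b_acc >= trained_acc
                for b_lat, b_acc in boundary
            )
            if dominated:
                continue
        kept.append(cand)
    return kept


# Illustrative values only, not results from the paper.
candidates = [(1, 1), (2, 1), (2, 2), (3, 3)]
remaining = prune(candidates, trained=(2, 2), trained_acc=0.70,
                  boundary=[(3.0, 0.72)])
```

In this toy setting, `(2, 1)` and `(2, 2)` are removed because a boundary point with latency 3.0 already reaches accuracy 0.72, while `(1, 1)` is kept because it is faster than every boundary point and `(3, 3)` is kept because it does not precede the trained architecture.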