Abstract: Deep Convolutional Neural Networks (CNNs) have become a de facto standard that has increased the robustness and accuracy of machine vision systems. This accuracy comes at the price of a high computational cost, making CNNs too demanding for the limited processing capabilities of a smart camera node. We address this challenge by taking advantage of the hardware flexibility of FPGAs on one hand, and of the large amount of intrinsic parallelism that CNNs exhibit on the other hand. Our approach relies on directly mapping all the computing elements of a given CNN graph onto physical hardware resources. We demonstrate the feasibility of this so-called Direct Hardware Mapping (DHM) and discuss several associated implementation issues. As a proof of concept, we introduce the HADDOC2 tool, which automatically transforms a high-level CNN model into a platform-independent hardware description ready for FPGA implementation. The HADDOC2 framework and the library of CNN actors supporting the DHM approach are open-source and available online.