In this study, the authors propose an efficient embedded processing architecture that uses a vascular pattern extraction (VPE) algorithm to authenticate a user to an embedded system. The study first considers direction-based vascular pattern extraction (DBVPE) and analyses the computational workload of running software implementations on an embedded processor. The authors then present a comprehensive performance analysis of the VPE algorithm and examine in detail the factors that contribute to processing latency, including VPE recognition processing. To improve the efficiency of VPE processing in embedded devices, the authors describe the process of creating a highly efficient application-specific processor and extend the processor's base instruction set with custom instructions for recognition processing. The proposed methodology was implemented in a commercial extensible processor design flow using the Xtensa platform from Tensilica Inc. Experiments show that the proposed methodology achieves a 3.95-fold increase in vascular pattern recognition speed. Hence, the authors consider the technique to be efficient.