TY  - JOUR
CY  - New York
ID  - 1341689
UR  - http://dx.doi.org/10.1007/978-3-319-25087-8_22
AU  - Novák, David
AU  - Čech, Jan
AU  - Zezula, Pavel
N2  - We present an efficiency evaluation of similarity search techniques applied to visual features from deep neural networks. Our test collection consists of 20 million 4096-dimensional descriptors (320 GB of data). We test approximate k-NN search using several techniques, specifically the FLANN library (a popular in-memory implementation of a k-d tree forest), the M-Index (which uses recursive Voronoi partitioning of a metric space), and PPP-Codes, which work with memory codes of metric objects and use disk storage for candidate refinement. Our evaluation shows that as long as the data fit in main memory, FLANN and the M-Index have practically the same ratio between precision and response time. The PPP-Codes identify candidate sets ten times smaller than the other techniques, and the response times are around 500 ms for the whole 20M dataset stored on disk. The visual search with this index is available as an online demo application. The collection of 20M descriptors is provided as a public dataset to the academic community.
L2  - http://dx.doi.org/10.1007/978-3-319-25087-8_22
TI  - Efficient Image Search with Neural Net Features
PY  - 2015
PB  - Springer International Publishing
KW  - metric indexing
KW  - deep convolutional neural network
KW  - content-based image retrieval
SN  - 9783319250861
ER  - 

We present an efficiency evaluation of similarity search techniques applied to visual features from deep neural networks. Our test collection consists of 20 million 4096-dimensional descriptors (320 GB of data). We test approximate k-NN search using several techniques, specifically the FLANN library (a popular in-memory implementation of a k-d tree forest), the M-Index (which uses recursive Voronoi partitioning of a metric space), and PPP-Codes, which work with memory codes of metric objects and use disk storage for candidate refinement. Our evaluation shows that as long as the data fit in main memory, FLANN and the M-Index have practically the same ratio between precision and response time. The PPP-Codes identify candidate sets ten times smaller than the other techniques, and the response times are around 500 ms for the whole 20M dataset stored on disk. The visual search with this index is available as an online demo application. The collection of 20M descriptors is provided as a public dataset to the academic community.
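
To make the numbers above concrete, the following is a minimal Python sketch of two ideas the abstract relies on: the raw storage footprint of the collection, and the generic filter-and-refine pattern in which an index proposes a candidate set that is then re-ranked by exact distance. It is not the authors' implementation of FLANN, the M-Index, or PPP-Codes. The function names, the assumption of 4-byte floating-point values, the Euclidean distance, and the definition of precision as the fraction of true nearest neighbors recovered are all illustrative assumptions.

```python
import heapq
import math


def descriptor_storage_bytes(n_descriptors, dim, bytes_per_value=4):
    """Raw size of a descriptor collection.

    Assumption: each coordinate is stored as a 4-byte float.
    """
    return n_descriptors * dim * bytes_per_value


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def refine_candidates(query, candidate_ids, descriptors, k):
    """Re-rank a candidate set by exact distance and keep the k nearest.

    `candidate_ids` stands in for the output of any approximate index;
    `descriptors` maps an id to its full vector (hypothetical storage layer).
    """
    return heapq.nsmallest(
        k, candidate_ids, key=lambda i: euclidean(query, descriptors[i])
    )


def precision(approx_ids, exact_ids):
    """Fraction of the exact k-NN answer found by the approximate search."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


if __name__ == "__main__":
    size = descriptor_storage_bytes(20_000_000, 4096)
    print(f"raw collection size: {size / 1e9:.2f} GB")

    # Toy data: four 2-dimensional descriptors and a candidate set of three.
    descriptors = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [5.0, 5.0], 3: [0.5, 0.0]}
    top2 = refine_candidates([0.0, 0.0], [1, 2, 3], descriptors, k=2)
    print("refined top-2 ids:", top2)
```

Under the 4-byte assumption, each 4096-dimensional descriptor occupies 16 KiB, which is consistent with the roughly 320 GB quoted for 20 million descriptors. The refinement step also shows why a smaller candidate set matters when descriptors live on disk: every candidate costs one full-vector read and one exact distance computation.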