Abstract

Similarity search and content-based retrieval have become widely used in multimedia database systems that often manage huge data collections. Unfortunately, many effective content-based similarity models cannot be fully utilized for larger datasets, as they are computationally demanding and require massively parallel processing for both feature extraction and query evaluation tasks. In this work, we address the performance issues of effective similarity models based on feature signatures, where we focus on fast feature extraction from image thumbnails using affordable hardware. More specifically, we propose a multi-GPU implementation that increases the extraction speed by two orders of magnitude with respect to a single-threaded CPU implementation. Since the extraction algorithm is not directly parallelizable, we propose a modification of the algorithm that embraces the SIMT execution model. We have experimentally verified that our GPU extractor can be successfully used to index large image datasets comprising millions of images. To obtain optimal extraction parameters, we employed the GPU extractor in an extensive empirical investigation of the parameter space. The experimental results are discussed from the perspectives of both performance and similarity precision.

Keywords

GPU, Parallel, Extraction, Feature signature, Image indexing

This paper is an extended version of a previous paper by Kruliš et al. [20], which was presented as a work-in-progress report. We have completed our solution and present the final version in full detail. Furthermore, we have included a detailed description of the feature extraction process and extensive experimental data, which could help with the selection of an optimal parameter configuration for the extractor. The main objective of this paper is to provide practical experience and guidelines for anyone who wishes to adopt this approach at the application level.