Adaptive Relevance Feedback for Large-scale Image Retrieval

Our research addresses the need for efficient, effective, and interactive access to large-scale image collections. Image retrieval needs are evolving beyond the capabilities of traditional indexing based on manual annotation. An image retrieval system should therefore be able to work with automatically extracted visual indexing features while offering users a simple and intuitive interaction. In this thesis, we investigate an innovative query-free retrieval approach proposed by Ferecatu and Geman. Starting from a heuristic sampling of the collection, this approach requires no explicit query, neither keywords nor example images. It relies solely on an iterative relevance feedback mechanism driven by the user's subjective judgments of image similarity. At each iteration, the system displays a small set of images and the user is asked to choose the one that, in her opinion, best matches what she is searching for. The system then updates an internal state based on automatically extracted indexing features and displays a new set of images accordingly. The idea is that the system converges towards what the user is searching for, displaying more and more relevant images at each iteration. Our contributions relate to four complementary aspects of the iterative relevance feedback mechanism. First, we formalize a large-scale approach based on a hierarchical tree-like organization of the images computed off-line. Second, we propose a versatile modulation of the exploration/exploitation trade-off based on the consistency of the system's internal states between successive iterations. Third, we elaborate a long-term optimization of the similarity metric based on the logs of user search sessions accumulated off-line. Fourth, we propose a dynamic short-term adaptation of the similarity metric based on the relevance feedback events accumulated on-the-fly at each iteration.
Furthermore, we round off our research by integrating all our contributions into one comprehensive retrieval system. Experimental validation was carried out by implementing a web application that includes all our contributions. This software is distributed to the public under the AGPL Version 3 open-source license. We carried out numerous user-based evaluation campaigns and systematically analyzed all our contributions. We show empirically that each of them significantly improves the retrieval performance of the original framework. Moreover, we show that they are complementary, and that their overall integration is consistently beneficial. We expect that our contributions, along with our open-source web application, will motivate further investigations and facilitate further experiments. We hope that our research brings the iterative relevance feedback mechanism one step closer to commercial applications.
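
To make the iterative relevance feedback loop described above more concrete, the following is a minimal sketch in Python. It is not the implementation evaluated in this thesis. It assumes that the internal state is a normalized weight per image, that the display set is sampled in proportion to those weights, and that the state is updated with a softmax-style likelihood over Euclidean distances between feature vectors. All function names, the `temperature` parameter, and the simulated user (who always picks the displayed image closest to a hidden target) are hypothetical and chosen only for illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def choose_display(weights, k, rng):
    """Sample k distinct image indices, favouring images with high weight."""
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        w = [weights[i] for i in candidates]
        pick = rng.choices(candidates, weights=w, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen


def update(weights, features, displayed, selected, temperature=1.0):
    """Reweight every image by how plausible the user's choice would be
    if that image were the one being searched for (assumed softmax model)."""
    position = displayed.index(selected)
    new_weights = []
    for i, w in enumerate(weights):
        scores = [
            math.exp(-distance(features[i], features[d]) / temperature)
            for d in displayed
        ]
        likelihood = scores[position] / sum(scores)
        new_weights.append(w * likelihood)
    total = sum(new_weights)
    return [w / total for w in new_weights]


def simulate(features, target, k=4, iterations=10, temperature=0.2, seed=0):
    """Run a feedback session with a simulated user and return the final weights."""
    rng = random.Random(seed)
    n = len(features)
    weights = [1.0 / n] * n  # uniform initial state: no explicit query
    for _ in range(iterations):
        displayed = choose_display(weights, k, rng)
        # Simulated user: selects the displayed image closest to the target.
        selected = min(
            displayed, key=lambda d: distance(features[d], features[target])
        )
        weights = update(weights, features, displayed, selected, temperature)
    return weights


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical collection: 50 images with 2-dimensional features.
    features = [[rng.random(), rng.random()] for _ in range(50)]
    final = simulate(features, target=7)
    ranked = sorted(range(len(final)), key=lambda i: final[i], reverse=True)
    print("Five highest-weighted images:", ranked[:5])
```

In this sketch, the `temperature` parameter controls how sharply a single choice reshapes the weights, which gives one simple handle on the exploration/exploitation trade-off. The contributions listed above (hierarchical organization, adaptive trade-off, and long-term and short-term metric adaptation) would replace the fixed sampling rule, the fixed temperature, and the fixed Euclidean distance assumed here.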