WhittleSearch: Image Search with Relative Attribute Feedback

Adriana Kovashka, Devi Parikh, and Kristen Grauman

Note: We have a patent pending for this work.
We also have a demo for this project.

Abstract

We propose a novel mode of feedback for image search,
where a user describes which properties of exemplar images
should be adjusted in order to more closely match his/her
mental model of the image(s) sought. For example, perusing
image results for a query "black shoes", the user might
state, "Show me shoe images like these, but sportier." Offline,
our approach first learns a set of ranking functions,
each of which predicts the relative strength of a nameable
attribute in an image ('sportiness', 'furriness', etc.). At
query time, the system presents an initial set of reference
images, and the user selects among them to provide relative
attribute feedback. Using the resulting constraints in
the multi-dimensional attribute space, our method updates
its relevance function and re-ranks the pool of images. This
procedure iterates using the accumulated constraints until
the top ranked images are acceptably close to the user.s envisioned
target. In this way, our approach allows a user to
efficiently "whittle away" irrelevant portions of the visual
feature space, using semantic language to precisely communicate
her preferences to the system. We demonstrate the
technique for refining image search for people, products,
and scenes, and show it outperforms traditional binary relevance
feedback in terms of search speed and accuracy.

Introduction

Existing image search methods rely either on keywords or on content-based retrieval. Keywords are not enough-- we cannot pre-tag all images the user might like to search with keywords that match whatever query the user might come up with. On the other hand, content-based image retrieval is limited by the well-known "semantic gap" between low-level cues and the higher-level user intent. User feedback can help, but current methods provide a very narrow channel for human feedback -- it is not clear what about the marked images is relevant or irrelevant.

Therefore, we propose to use relative attributes for more specific user feedback in image search. We allow user to describe precisely what is missing from current set of results. The user expresses the semantics of their search goal through relative attributes, relating their target and some pre-selected exemplar images. Note that attributes (or "concepts" in the information retrieval community) have been used previously for search, but users have not been allowed to isolate individual attributes as a handle for feedback.

Background: Binary Relevance Feedback

User marks some images as relevant or irrelevant given their search target.

We learn a binary classifier using the relevant images as positives and the irrelevant images as negatives, and rank images in the dataset by the classifier outputs.

Relative Attribute Feedback

Learning to predict relative attributes

We learn relative attributes as in Parikh and Grauman, "Relative Attributes", ICCV 2011.

Interface for image-level relative attribute annotations.

Obtain ordered image pairs and unordered pairs such that (image i has stronger presence of attribute m than image j) and (the images have equivalent presence of attribute m).

Learn a ranking function such that as many of the following constraints are satisfied as possible: .

For the latter, use the formulation due to Thorsten Joachims, "Optimizing Search Engines Using Clickthrough Data", KDD 2002: minimize subject to:

Updating the scoring function from feedback

User selects some images and marks how they differ from the image want, thus defining constraints: "I want [objects] that are [more/less] [attribute name] than this image."

We update the scores for each image in the dataset, using these constraints.

For a constraint of the type: "I want images exhibiting more of attribute m than reference image , the scoring function should satisfy:

For a constraint of the type: "I want images exhibiting less of attribute m than reference image , the scoring function should satisfy:

For constraints of the type: "I want images that are similar in terms of attribute m to reference image , the scoring function should satisfy:

At the top we rank images whose score , i.e. images that satisfy all F constraints given so far. Next are images which satisfy F-1 constraints, etc.

A toy example illustrating the intersection of relative
constraints with M = 2 attributes. The images are plotted on
the axes for both attributes. The space of images that satisfy each
constraint are marked in a different color. The region satisfying all
constraints is marked with a black dashed line. In this case, there
is only one image in it (outlined in black).

Note that our method allows user to refine their query, in a way that a query stated in absolute attribute terms cannot.

Our method is efficient, since it only involves set-logic operations and no learning.

Hybrid Feedback Approach

Using images marked as positive (), ones marked as negative (), and our the sets of images which satisfy k constraints, we can define a set which includes all relevance preferences:
and a set which expresses equivalent relevance:

Then we can learn a relevance ranking function.

Experimental Results

Experimental Design

Datasets

Shoes -- 14,658 shoe images from the Attribute Discovery dataset, augmented with 10 attributes: pointy at the front, open, bright in color, high at the heel, covered with ornaments, shiny, long on the leg, formal, sporty, feminine -- dataset and instance-level relative annotations can be downloaded here

Absolute vs. relative

Conclusion

We proposed a method that allows user to communicate very precisely how the retrieved results compare with their mental model. Our method refines results more effectively, often with less human effort.

Note: Please send a blank email to [adriana AT cs DOT utexas DOT edu] with "WhittleSearch Shoes dataset" in the subject line so we know who downloaded our dataset, and also so we can let you know of any updates to this dataset. We will also post any updates on this page.

Note: The full OSR and PubFig datasets from the "Relative Attributes" paper can be found here.