The problem of object recognition has not yet been solved in its general form. The most successful approach to it so far relies on object models obtained by training a statistical method on visual features obtained from camera images. The images must necessarily come from huge visual datasets, in order to circumvent all problems related to changing illumination, point of view, etc.
We hereby propose to also consider, in an object model, a simple model of how a human being would grasp that object (its affordance). This knowledge is represented as a function mapping visual features of an object to the kinematic features of a hand while grasping it. The function is practically enforced via regression on a human grasping database.
After describing the database (which is publicly available) and the proposed method, we experimentally evaluate it, showing that a standard object classifier working on both sets of features (visual and motor) has a significantly better recognition rate than that of a visual-only classifier.