Patch-based appearance models are used in a wide range of computer vision
applications. To learn such models, it has previously been necessary to specify a
suitable set of patch sizes and shapes by hand. In the jigsaw model presented
here, the shape, size and appearance of patches are learned automatically from
the repeated structures in a set of training images. By learning such irregularly
shaped ‘jigsaw pieces’, we are able to discover both the shape and the appearance
of object parts without supervision. When applied to face images, for example,
the learned jigsaw pieces turn out to be strongly associated with face parts of
different shapes and scales, such as eyes, noses, eyebrows and cheeks. We
conclude that learning the shape of the patch not only improves the
accuracy of appearance-based part detection but also allows for shape-based part
detection. This enables parts of similar appearance but different shapes to be
distinguished; for example, while foreheads and cheeks are both skin-colored,
they have markedly different shapes.