A CNN Cascade for Landmark Guided Semantic Part Segmentation

Abstract

This paper proposes a CNN cascade for semantic part segmentation
guided by pose-specific information encoded in terms of a set of
landmarks (or keypoints). There is a large amount of prior work on
each of these tasks separately, yet, to the best of our knowledge,
this is the first time in the literature that the interplay between pose
estimation and semantic part segmentation is investigated. To
address this limitation of prior work, in this paper, we propose a
CNN cascade of tasks that first performs landmark localisation and
then uses this information as input for guiding semantic part
segmentation. We apply our architecture to the problem of facial
part segmentation and report a large performance improvement over
the standard unguided network on the most challenging face
datasets.

Proposed Architecture
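
The abstract describes a two-stage cascade: landmarks are localised
first and then passed as input to the segmentation network. The
sketch below illustrates one way that hand-off could look, assuming
each landmark is encoded as a 2D Gaussian heatmap and stacked with
the image channels. The Gaussian encoding, the function names and
the sigma value are assumptions made for illustration and are not
taken from the released code.

```python
import numpy as np


def landmarks_to_heatmaps(landmarks, height, width, sigma=5.0):
    """Encode (x, y) landmarks as one Gaussian heatmap each (assumed encoding)."""
    # Pixel coordinate grids: ys varies along rows, xs along columns.
    ys, xs = np.mgrid[0:height, 0:width]
    maps = [
        np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        for x, y in landmarks
    ]
    # Shape: (num_landmarks, height, width)
    return np.stack(maps, axis=0)


def build_guided_input(image, landmarks, sigma=5.0):
    """Stack image channels (C, H, W) with landmark heatmaps into (C + K, H, W)."""
    _, height, width = image.shape
    heatmaps = landmarks_to_heatmaps(landmarks, height, width, sigma)
    return np.concatenate([image, heatmaps], axis=0)


if __name__ == "__main__":
    # Hypothetical 3-channel image and two landmarks given as (x, y).
    image = np.zeros((3, 32, 48))
    guided = build_guided_input(image, [(10, 5), (40, 20)])
    print(guided.shape)
```

In this sketch the segmentation stage would receive the stacked
array in place of the plain image, so the landmark positions are
available to it as extra input channels.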

Visual Example

Links

[tar.gz] (approx. 1 GB). This archive contains the models and
Python code to test the network. Some test images are included,
but you are welcome to try your own. The code depends on
several Python packages, such as matplotlib, numpy and scipy.
It is unlikely to work correctly on the master branch of caffe.
Please use caffe-future.tar.gz, which is a clone of
longjon/caffe@future kept for safekeeping.
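
Before running the test code, it may help to confirm that the
Python packages named above are installed. The snippet below is a
minimal sketch of such a check using only the standard library. The
helper name missing_packages is hypothetical, and the package list
covers only the three packages mentioned here, so the archive may
need others.

```python
import importlib.util

# Packages named in the description above; the real list may be longer.
REQUIRED_PACKAGES = ("matplotlib", "numpy", "scipy")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages were found.")
```

The check only reports whether each package can be located. It
does not verify versions or that caffe itself was built from the
recommended branch.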

BibTeX

If you find this code or paper useful, please cite it using
the reference below. If you use it for something outside of
academia, I would love to hear how. Please email me to let me
know.