Multiple View Semantic Segmentation for Street View Images

Abstract

We propose a simple but powerful multi-view semantic
segmentation framework for images captured by a camera
mounted on a car driving along streets. In our approach,
a pair-wise Markov Random Field (MRF) is laid out across
multiple views. Both 2D and 3D features are extracted at
a super-pixel level to train classifiers for the unary data
terms of the MRF. For the smoothness terms, our approach makes
use of color differences within the same image to identify accurate segmentation boundaries, and of dense pixel-to-pixel correspondences to enforce consistency across different views.
To speed up training and to improve the recognition quality,
our approach adaptively selects the most similar training
data for each scene from the label pool. Furthermore, we
propose a powerful approach within the same framework to enable large-scale labeling in both 3D space
and 2D images. We demonstrate our approach on more than
10,000 images from Google Maps Street View.
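The abstract describes an energy with three kinds of terms: per-superpixel unary costs from trained classifiers, intra-image smoothness weighted by color difference, and cross-view consistency along dense correspondences. The sketch below illustrates that energy's structure only; the cost values, edge weights, and correspondences are hypothetical placeholders, not the paper's learned classifiers or actual matching.

```python
# Illustrative sketch of a multi-view pairwise MRF energy of the kind
# described in the abstract. All inputs here are toy placeholders.

def mrf_energy(labels, unary, intra_edges, cross_edges,
               lambda_intra=1.0, lambda_cross=1.0):
    """labels: {superpixel_id: class}
    unary: {superpixel_id: {class: cost}} (classifier outputs)
    intra_edges: [(i, j, w)] within one image; w derived from color difference
    cross_edges: [(i, j)] linking corresponding superpixels across views"""
    e = sum(unary[i][labels[i]] for i in labels)
    # Potts-style smoothness inside an image, modulated by color similarity.
    e += lambda_intra * sum(w for i, j, w in intra_edges
                            if labels[i] != labels[j])
    # Consistency penalty across views via dense correspondences.
    e += lambda_cross * sum(1 for i, j in cross_edges
                            if labels[i] != labels[j])
    return e

# Toy usage: three superpixels, one cross-view link.
labels = {0: "road", 1: "road", 2: "building"}
unary = {0: {"road": 0.1, "building": 0.9},
         1: {"road": 0.2, "building": 0.8},
         2: {"road": 0.7, "building": 0.3}}
energy = mrf_energy(labels, unary,
                    intra_edges=[(0, 1, 0.5), (1, 2, 0.5)],
                    cross_edges=[(0, 2)])
```

Minimizing such an energy (e.g. with graph cuts or alpha-expansion) yields labelings that respect both the per-view classifiers and the cross-view links.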

Results

Poster

Source Code Download

ImgAnnotator:
A program that uses graph cuts for interactive image segmentation, intended for manual annotation.
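ImgAnnotator's own implementation is in the downloadable source above; the sketch below only illustrates the generic idea behind graph-cut interactive segmentation, on a toy 1-D "image" with a stdlib Edmonds-Karp max-flow. The scribble format, edge weights, and graph construction here are illustrative assumptions, not the tool's actual interface.

```python
# Generic sketch of scribble-based graph-cut segmentation (not the
# ImgAnnotator code): hard terminal links from user scribbles, n-links
# between neighboring pixels, then a min cut separates fg from bg.
from collections import defaultdict, deque

def _max_flow_source_side(cap, source, sink):
    """Edmonds-Karp; returns the nodes on the source side of the min cut."""
    flow = defaultdict(lambda: defaultdict(int))
    while True:
        parent, q = {source: None}, deque([source])
        while q and sink not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if v not in parent and c - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if sink not in parent:
            break
        path, v = [], sink
        while v != source:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += push
            flow[v][u] -= push
    reachable, q = {source}, deque([source])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if v not in reachable and c - flow[u][v] > 0:
                reachable.add(v)
                q.append(v)
    return reachable

def segment_scribbles(pixels, scribbles):
    """pixels: 1-D intensities; scribbles: {index: 'fg' or 'bg'}.
    Returns the set of pixel indices labeled foreground."""
    cap, INF = defaultdict(lambda: defaultdict(int)), 10**9
    def add_edge(u, v, w):
        cap[u][v] += w
        cap[v][u] += 0  # register reverse key for residual traversal
    for i, label in scribbles.items():
        if label == "fg":
            add_edge("s", i, INF)  # hard link to source terminal
        else:
            add_edge(i, "t", INF)  # hard link to sink terminal
    for i in range(len(pixels) - 1):
        # Strong n-link between similar neighbors, weak across an edge.
        w = 10 if abs(pixels[i] - pixels[i + 1]) < 50 else 1
        add_edge(i, i + 1, w)
        add_edge(i + 1, i, w)
    return {i for i in _max_flow_source_side(cap, "s", "t") if i != "s"}
```

On a toy image `[10, 12, 11, 200, 205, 198]` with pixel 0 scribbled foreground and pixel 5 background, the min cut severs the weak link at the intensity jump, labeling the first three pixels foreground.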

RenderMe:
This code renders a mesh given a 3x4 camera matrix and an image resolution.
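RenderMe's rendering code is in the download above; the snippet below only sketches the core projection step such a renderer performs: applying a 3x4 camera matrix P to homogeneous vertices, x = P [X; 1], dividing by depth, and clipping against the image resolution. The example matrix and the behind-camera convention (positive depth in front) are assumptions for illustration.

```python
# Sketch of projecting mesh vertices through a 3x4 camera matrix
# (the core step a mesh renderer like RenderMe performs per vertex).

def project_vertices(P, vertices, width, height):
    """P: 3x4 camera matrix as three row-lists of four numbers.
    vertices: iterable of (X, Y, Z) world points.
    Returns (u, v) pixel coordinates per vertex, or None for points
    behind the camera or outside the width x height image."""
    out = []
    for X, Y, Z in vertices:
        # Homogeneous projection: x = P * [X, Y, Z, 1]^T
        x = [P[r][0] * X + P[r][1] * Y + P[r][2] * Z + P[r][3]
             for r in range(3)]
        if x[2] <= 0:  # assumed convention: points in front have depth > 0
            out.append(None)
            continue
        u, v = x[0] / x[2], x[1] / x[2]  # perspective divide
        out.append((u, v) if 0 <= u < width and 0 <= v < height else None)
    return out

# Hypothetical camera: focal length 100, principal point (320, 240).
P = [[100, 0, 320, 0],
     [0, 100, 240, 0],
     [0, 0, 1, 0]]
pixels = project_vertices(P, [(0, 0, 1), (0, 0, -1)], 640, 480)
```

A full renderer would additionally rasterize triangles and resolve visibility with a z-buffer; the projection above is only the per-vertex step.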

Acknowledgments

This work was supported by Hong Kong RGC
Grants 618908, 619107, 619006, and RGC/NSFC NHKUST602/05. We thank Qiang Bi for labeling some data
and the anonymous reviewers and the area chair for constructive comments that helped to improve this work. The
data set was kindly provided by Google.