Discriminative Segment Annotation in Weakly Labeled Video

Venue

Publication Year

Authors

BibTeX

Abstract

This paper tackles the problem of segment annotation in complex Internet videos. Given a
weakly labeled video, we automatically generate spatiotemporal masks for each of
the concepts with which it is labeled. This is a particularly relevant problem in
the video domain, as large numbers of YouTube videos are now available, tagged with
the visual concepts that they contain. Given such weakly labeled videos, we focus
on the problem of spatiotemporal segment classification. We propose a
straightforward algorithm, CRANE, that utilizes large amounts of weakly labeled
video to rank spatiotemporal segments by the likelihood that they correspond to a
given visual concept. We make publicly available segment-level annotations for a
subset of the Prest et al. dataset and show convincing results. We also show
state-of-the-art results on Hartmann et al.'s more difficult, large-scale object
segmentation dataset.