Object Detection in Video

We introduce a framework for refining object detection in video. Our approach extracts contextual information from neighboring frames, generating predictions with state of the art accuracy that are also temporally consistent. Importantly, our model benefits from context frames even when they lack ground truth annotations.