In this paper, we present an alternative framework for video processing that requires neither explicit motion estimation nor segmentation. Motivated by the geometric constraint of motion trajectories, we propose an adaptive filtering-based model for video signals in which filter coefficients are locally estimated by the least-squares method. Such localized estimation can be viewed as an implicit approach to exploiting motion-related temporal dependency. We also introduce the concept of a virtual camera to further improve the modeling capability by exploiting the fundamental tradeoff between space and time. Using mixture models, we show how to probabilistically fuse the inference results obtained from virtual cameras in order to achieve spatio-temporal adaptation. The implicit and mixture motion model supplements the existing paradigm and provides a unified solution to a wide range of low-level vision problems, including video dejittering, impulse removal, error concealment, video coding, and temporal interpolation.
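The core idea of locally estimating filter coefficients by least squares can be sketched as follows. This is a minimal illustration, not the paper's implementation: the filter support (three horizontal taps in the previous frame), the causal training window size, and the function name `local_ls_filter` are all assumptions made for the example.

```python
import numpy as np

def local_ls_filter(prev_frame, cur_frame, y, x, win=4, taps=3):
    """Predict cur_frame[y, x] from a `taps`-pixel neighborhood in
    prev_frame, with coefficients fit by least squares over a local
    (2*win+1) x (2*win+1) training window around (y, x).

    Hypothetical sketch: filter support and window size are assumptions,
    not the paper's actual choices.
    """
    half = taps // 2
    A, b = [], []
    # Collect training pairs (neighborhood in prev_frame -> pixel in cur_frame)
    for dy in range(-win, win + 1):
        for dx in range(-win, win + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < prev_frame.shape[0] and half <= xx < prev_frame.shape[1] - half:
                A.append(prev_frame[yy, xx - half:xx + half + 1])
                b.append(cur_frame[yy, xx])
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Locally estimated least-squares coefficients
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Apply the fitted filter at the target pixel
    return coef @ prev_frame[y, x - half:x + half + 1]
```

When the local motion is a pure translation covered by the filter support (e.g., a one-pixel horizontal shift), the least-squares fit recovers a shifting filter and the prediction is nearly exact, which is the sense in which localized estimation implicitly tracks the motion trajectory.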