Motion detection

Motion detection algorithms generally work by comparing the incoming video image to a reference image. The reference image may be one or more previous frames or a pre-defined background. Motion is detected by analyzing deviations from the reference and attributing each deviation either to genuine motion or to noise, such as unintended movement of the camera mount.

When the camera is stationary, a common motion detection approach is to perform background subtraction. With background subtraction, a model of the static scene, called the background, is built. Incoming frames are compared to this background in order to detect regions of movement. Many methods exist for background subtraction; Piccardi's background subtraction review [1] provides an overview of the most common approaches.

Other motion detection algorithms have been proposed, such as Foreground Motion Detection by Difference-Based Spatial Temporal Entropy Image [2], which builds histograms of the difference between frames and computes their entropy. The magnitude of the entropy is used to determine the magnitude of motion.

OpenCV and motion detection

OpenCV is a library of programming functions for real-time computer vision. It includes specialized Motion Analysis and Object Tracking functions [3] and specialized Image Processing and Analysis functions [4], which can be leveraged to implement motion detection algorithms. An example showing how to use OpenCV to implement a background subtraction algorithm can be found in chapter 9 of the O'Reilly Learning OpenCV book [5]. In practice, however, this simple approach turned out to be easier to implement by coding the difference algorithm directly than by going through the OpenCV library.
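As a sketch of what directly implementing the difference algorithm can look like, here is a minimal frame-difference detector in C. The function names and threshold parameters are illustrative only, not the actual code used in the project; 8-bit grayscale frames are assumed.

```c
/* Count pixels whose absolute difference between the previous and
 * current frame exceeds a per-pixel threshold.  (Illustrative sketch;
 * names and thresholds are made up.) */
int count_changed_pixels(const unsigned char *prev,
                         const unsigned char *cur,
                         int num_pixels, int pixel_threshold)
{
    int changed = 0;
    for (int i = 0; i < num_pixels; i++) {
        int diff = (int)cur[i] - (int)prev[i];
        if (diff < 0)
            diff = -diff;
        if (diff > pixel_threshold)
            changed++;
    }
    return changed;
}

/* Report motion when enough pixels changed to rise above sensor noise. */
int motion_from_difference(const unsigned char *prev,
                           const unsigned char *cur,
                           int num_pixels, int pixel_threshold,
                           int count_threshold)
{
    return count_changed_pixels(prev, cur, num_pixels, pixel_threshold)
           > count_threshold;
}
```

The per-pixel threshold filters sensor noise on individual pixels, while the count threshold filters isolated noisy pixels at the frame level.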

Motion detection as part of a GStreamer video capture pipeline

GStreamer is a framework that allows dynamic streaming media pipelines to be easily created. GStreamer consists of hundreds of elements that can be connected in a pipeline. Motion detection can be accomplished by creating a new GStreamer element and including that element in the video pipeline. This allows the motion detection algorithm (MDA) to easily grab video frames as they move through the pipeline. The rate at which the MDA grabs frames can be adjusted to match the amount of CPU time that is available: the more CPU time, the more frames can be processed, and thus the more accurate the motion detection.
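One simple way to make the frame-grab rate tunable is a counter gate inside the MDA element. The sketch below is plain C with no GStreamer dependency and hypothetical names; in a real element this state would live in the element's instance structure.

```c
/* Gate that passes one frame out of every `stride` through to the
 * motion detection algorithm, so the MDA's CPU load can be matched to
 * the headroom available.  (Hypothetical sketch.) */
typedef struct {
    int stride;   /* analyze 1 frame out of every `stride` */
    int counter;  /* frames seen since the last analyzed one */
} frame_gate;

int frame_gate_should_process(frame_gate *g)
{
    int process_now = (g->counter == 0);
    g->counter = (g->counter + 1) % g->stride;
    return process_now;
}
```

Raising `stride` reduces CPU usage at the cost of slower reaction to motion; the controlling application could adjust it at runtime.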

The MDA element can report changes in motion detection to the controlling application. The controlling application can take an action, such as causing the GStreamer pipeline to start recording or stop recording based on the changes reported by the MDA.

This design separates the controlling application logic from the streaming audio/video pipeline. Further, the actual motion detection algorithm is loosely coupled to the MDA element, so different algorithms can be developed and used without changing the rest of the system. GStreamer even provides a mechanism by which the MDA element could be directed to switch algorithms without interfering with the streaming pipeline. This might be useful if a low-complexity algorithm is used while video is being recorded (when less CPU is available) and a more complex algorithm is used when the system is trying to detect motion.

Approximate median method for background subtraction

From Aresh Saharkhiz [6]: The approximate median method works as follows: if a pixel in the current frame has a value larger than the corresponding background pixel, the background pixel is incremented by one. Likewise, if the current pixel is less than the background pixel, the background pixel is decremented by one. In this way, the background eventually converges to an estimate where half the input pixels are greater than the background and half are less than the background, which is approximately the median (convergence time will vary with the frame rate and the amount of movement in the scene).
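The per-pixel update rule can be written down directly. A sketch in C, assuming 8-bit grayscale frames (function names are illustrative):

```c
/* One approximate-median update step for a single 8-bit background
 * pixel: nudge the background one step toward the current frame. */
unsigned char approx_median_update(unsigned char bg, unsigned char cur)
{
    if (cur > bg)
        return bg + 1;
    if (cur < bg)
        return bg - 1;
    return bg;
}

/* Apply the update rule to a whole grayscale background image. */
void approx_median_update_frame(unsigned char *bg,
                                const unsigned char *cur, int num_pixels)
{
    for (int i = 0; i < num_pixels; i++)
        bg[i] = approx_median_update(bg[i], cur[i]);
}
```

Because each pixel moves by at most one gray level per frame, the background adapts slowly, which is what makes the method robust to brief foreground motion.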

A simple approach to turn the approximate median method for background subtraction into a motion detection algorithm is to add a counter that is incremented each time a background pixel is changed. Multiple frames must be processed before the background is stable, resulting in spurious motion detection reports during that period. Once the background is stable, a noise threshold can be set, and when the pixel change count rises above the noise threshold, motion is detected. Once motion is detected, it will typically take multiple frames after the motion stops before the background becomes stable again.
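The counter can be folded directly into the update loop. A self-contained sketch in C (hypothetical names; in practice the noise threshold would be calibrated only after the background has stabilized):

```c
/* Approximate-median background update that also counts how many
 * background pixels changed this frame.  (Illustrative sketch.) */
int approx_median_update_and_count(unsigned char *bg,
                                   const unsigned char *cur,
                                   int num_pixels)
{
    int changed = 0;
    for (int i = 0; i < num_pixels; i++) {
        if (cur[i] > bg[i]) {
            bg[i]++;
            changed++;
        } else if (cur[i] < bg[i]) {
            bg[i]--;
            changed++;
        }
    }
    return changed;
}

/* Motion is reported when the change count rises above the noise
 * threshold; meaningful only once the background has stabilized. */
int motion_from_median_background(unsigned char *bg,
                                  const unsigned char *cur,
                                  int num_pixels, int noise_threshold)
{
    return approx_median_update_and_count(bg, cur, num_pixels)
           > noise_threshold;
}
```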

This algorithm requires the camera to be steady, as any camera movement (such as wind-induced vibration) will shift the image, necessitating a more complex algorithm that first accounts for camera movement.

Motion detection and computational complexity

Typically, motion detection algorithms are run on devices containing dedicated digital signal processors. To allow an embedded general-purpose processor to detect motion, simplified algorithms may be required, along with a reduction in the number of frames per second that are processed.

The approximate median algorithm for background subtraction involves taking the difference of two frames and using the difference to determine how to update the background. The most CPU-intensive step is the loop doing the pixel-by-pixel comparison. On a DM365, that step was performed using code similar to that shown at the bottom of this page. The 300 MHz ARM CPU took 5 us to process that step. If we assume that 50% of the algorithm's time is spent in the XXXX step, then we can calculate the number of frames per second that can be processed when the ARM is idle: XXXX fps. If we assume the ARM has XXXX% idle time when recording video at XVGA resolution, then we can run the approximate median algorithm on XXXX frames per second without interfering with the GStreamer video recording pipeline.

Simplified embedded motion detection

To support rudimentary motion detection on an ARM processor, the approximate median algorithm for background subtraction appears to be the best choice. This MDA would be developed, along with the general GStreamer MDA element framework, to allow pipelines to be created that detect motion. If a more complex motion detection algorithm is required later, the code architecture can be reused. Implementation of the approximate median background subtraction MDA along with the general GStreamer MDA element framework is estimated at 80 hours.

Hardware accelerated motion detection

One step common to implementing a video encoder, such as H.264, is motion vector calculation. An approach to motion detection is to use the results of the motion vector calculation step to determine whether any motion is occurring in the video.

For example, on the TI DM36x processors, you can access the motion vectors as documented in the [wiki.tiprocessors.com/images/3/39/MV_SAD_usage_info.pdf Using MV/SAD information from DM365 encoder in application] document. Although the document focuses on encoding MPEG4 or H.264, it explains the technical information necessary to gain access to the motion vector data.
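For illustration only: assuming the encoder exports one motion vector per macroblock (the struct layout below is hypothetical, not the actual DM365 format described in the TI document), detecting motion from the vectors could be as simple as summing their magnitudes.

```c
/* Hypothetical per-macroblock motion vector as exported by a hardware
 * encoder; the real DM365 layout is described in the TI document. */
typedef struct {
    short x;  /* horizontal displacement */
    short y;  /* vertical displacement */
} motion_vector;

/* Sum of |x| + |y| over all macroblocks: a crude measure of how much
 * motion the encoder saw in the frame. */
long total_mv_magnitude(const motion_vector *mv, int num_blocks)
{
    long sum = 0;
    for (int i = 0; i < num_blocks; i++) {
        sum += (mv[i].x < 0) ? -(long)mv[i].x : mv[i].x;
        sum += (mv[i].y < 0) ? -(long)mv[i].y : mv[i].y;
    }
    return sum;
}

int motion_from_vectors(const motion_vector *mv, int num_blocks,
                        long threshold)
{
    return total_mv_magnitude(mv, num_blocks) > threshold;
}
```

This offloads the expensive per-pixel work to the encoder hardware; the ARM only aggregates a few hundred vectors per frame.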

No tests on the viability of using video encoder motion vector data for motion detection have yet been carried out. When such data becomes available, this wiki page will be updated.