How Content-recognition Software Works

Recently, Time Warner and Disney partnered with YouTube to test video content-recognition software developed by Google. The software is similar to existing audio content-recognition programs in that it analyzes content to create a fingerprint. Then it compares that information to fingerprints in a database to determine if there is a match. However, video presents unique challenges that are not easily overcome.

For example, most videos on YouTube are limited to 10 minutes or 100 megabytes. Since a clip could include any 10-minute segment from a film or television show under copyright, the content-recognition software must analyze the entire original work in such a way that it can make meaningful matches from a relatively small sample clip. Google isn't saying much about how the software manages this, but it's likely that the program analyzes overlapping chunks of the original content to create multiple fingerprints.
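Since Google hasn't published its method, the following is only a minimal sketch of the overlapping-chunk idea, not the actual software. It assumes each frame has already been reduced to a single number from 0 to 255 (a hypothetical stand-in for real video features), and every function name here is made up for illustration. An exact hash like this would break on the slightest edit, so treat it as a picture of the lookup process rather than a workable detector.

```python
import hashlib

def fingerprint(chunk):
    """Hash one chunk of per-frame feature values (ints from 0 to 255)."""
    return hashlib.sha1(bytes(chunk)).hexdigest()

def windows(features, size, step):
    """Yield (offset, chunk) for overlapping chunks of `size` frames."""
    for start in range(0, len(features) - size + 1, step):
        yield start, features[start:start + size]

def build_index(title, features, size=8, step=4):
    """Map each chunk fingerprint of the original work to (title, offset)."""
    return {fingerprint(chunk): (title, start)
            for start, chunk in windows(features, size, step)}

def find_match(index, clip, size=8):
    """Slide over the clip one frame at a time and look up each chunk."""
    for start, chunk in windows(clip, size, 1):
        hit = index.get(fingerprint(chunk))
        if hit is not None:
            title, offset = hit
            # Estimate where the clip begins inside the original work.
            return title, offset - start
    return None

# Hypothetical data: 200 distinct per-frame values standing in for a film.
original = [(i * 37 + 11) % 256 for i in range(200)]
index = build_index("Example Film", original)

# A short clip taken from the middle of the original.
clip = original[50:80]
print(find_match(index, clip))
```

Because the chunks overlap, a clip that starts at an arbitrary point still lines up with one of the indexed chunks after sliding a few frames, which is why a small sample can be matched against a full-length work.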

Video content-recognition software must be able to identify footage even if the person who uploaded the content edited it first. For example, people can fool software that matches on color characteristics by tweaking a video's color saturation. Cropping a video or uploading footage of a film captured on a video camera can also fool recognition software. Some pirated films are captured on cameras set up at an angle to the screen, further complicating the identification process.

One approach developers are trying is to base fingerprints on an analysis of how motion characteristics change over the course of a video. Even this could prove ineffective if someone uploads a pirated video captured on a hand-held camera. In some cases, the probability range for matches may need to be fairly wide to flag all possible cases of piracy. Film studios may discover that they will still need a real person to review video clips to confirm a case of infringement. Still, the initial identification of potential video piracy will be much more efficient.
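To see why the matching range matters, here is a rough sketch under stated assumptions. It treats the change in a per-frame value (say, average brightness) from one frame to the next as a crude substitute for real motion analysis, and it flags a match when two signatures differ by no more than a chosen tolerance. The function names, sample numbers and tolerance values are all hypothetical; no developer's actual method is shown here.

```python
def motion_signature(frame_values):
    """Differences between consecutive per-frame values (a crude motion proxy)."""
    return [b - a for a, b in zip(frame_values, frame_values[1:])]

def distance(sig_a, sig_b):
    """Mean absolute difference between two equal-length signatures."""
    if len(sig_a) != len(sig_b) or not sig_a:
        raise ValueError("signatures must be non-empty and the same length")
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)

def is_possible_match(sig_a, sig_b, tolerance):
    """Flag a candidate for human review when the signatures are close enough."""
    return distance(sig_a, sig_b) <= tolerance

# Hypothetical per-frame values for an original and three uploads.
original = [10, 12, 20, 18, 30, 31, 25]
brightened = [v + 15 for v in original]            # uniform color tweak
shaky = [v + j for v, j in zip(original, [0, 2, -1, 1, -2, 0, 1])]  # hand-held jitter
unrelated = [40, 10, 45, 5, 50, 0, 55]

ref = motion_signature(original)
for name, frames in [("brightened", brightened), ("shaky", shaky), ("unrelated", unrelated)]:
    sig = motion_signature(frames)
    print(name, round(distance(ref, sig), 2),
          is_possible_match(ref, sig, tolerance=1.0),
          is_possible_match(ref, sig, tolerance=3.0))
```

In this toy data, the uniformly brightened copy matches at any tolerance because only the frame-to-frame changes are compared. The jittery copy is missed at a tolerance of 1.0 and caught at 3.0. A wider range catches more edited copies, but it also raises the odds of flagging innocent footage, which is where a human reviewer comes in.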

Video-identification software is still in the testing stage, though some companies are already holding effective demos of their programs. Challenges in identification won't end once the software is perfected, though. The sheer volume of video content presents a big problem. Movie and television studios will need to constantly update their databases with fingerprints for all the new content that comes out every day. While the process for uncovering piracy may become more efficient, it will still require constant upkeep and maintenance.
