Metadata Is A Communication Channel To The Future

Finding something you have stored is a challenge, even when what you have stored is digital information. If the content is text, you can run a text search, but sound and video are harder to search. You can create metadata (information about information that stays associated with the original data) to help find content later, but unless that metadata is organized according to some standard format, even knowing which metadata to search for can be a challenge.
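To make the point concrete, here is a minimal sketch of how a shared metadata schema makes content findable. The field names ("title", "keywords", "recorded") and the catalog entries are illustrative, not drawn from any real standard:

```python
# Toy catalog of media assets that all follow one metadata schema.
# Because every asset uses the same "keywords" field, a single
# query works across the whole repository.
assets = [
    {"title": "Board meeting", "keywords": ["finance", "q3"], "recorded": "2011-09-12"},
    {"title": "Product demo", "keywords": ["video", "launch"], "recorded": "2012-03-01"},
    {"title": "Interview", "keywords": ["finance", "ceo"], "recorded": "2012-05-20"},
]

def find_by_keyword(catalog, keyword):
    """Return every asset tagged with the given keyword."""
    return [asset for asset in catalog if keyword in asset["keywords"]]

hits = find_by_keyword(assets, "finance")
```

If each asset used its own ad-hoc field names instead, this one-line query would be impossible, which is exactly the argument for converging standards made below.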

Fortunately, metadata standards for professional content appear to be converging, making metadata an increasingly critical element of a modern media repository. Adding metadata to data makes the data easier to find and use. Metadata is a communication channel to the future; perhaps it is even more than that.

Image: "Metadata Is a Love Note to the Future" (https://www.flickr.com/photos/sarahseverson/6245395188/in/photostream/), based on a Jason Scott talk in September 2011.

Many modern professional video cameras can add GPS location and time information as part of the content metadata. This is important metadata, but historically much of the metadata about what happens in a piece of content, or who is involved, has been entered manually, which takes time and is subject to error.
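A rough sketch of the kind of capture metadata a camera records is shown below. The JSON "sidecar" schema and the clip name are invented for illustration; real cameras typically embed this information inside the media file itself (for example in MP4/QuickTime atoms or Exif fields) rather than writing a separate record:

```python
import json
from datetime import datetime, timezone

def make_sidecar(clip_name, latitude, longitude):
    """Build a JSON record pairing a clip with its capture metadata.
    Schema is illustrative only, not a real camera-vendor format."""
    return json.dumps({
        "clip": clip_name,
        "gps": {"lat": latitude, "lon": longitude},
        # UTC timestamp recorded automatically, no manual entry needed
        "captured_utc": datetime.now(timezone.utc).isoformat(),
    })

record = make_sidecar("scene_042.mp4", 45.5017, -73.5673)
```

The point is that location and time cost nothing to capture automatically, while the "who" and "what" fields discussed next have traditionally required error-prone manual entry.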

There are software tools available for converting speech to text and for image recognition, which can create metadata describing the content. These tools use artificial intelligence technologies such as neural networks to make probable guesses at the words and people in audio and video content. As these technologies are exposed to more and more content, and learn what is right and wrong, they become increasingly accurate.
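One simple way such output becomes metadata is sketched below: a speech recognizer emits (word, confidence) pairs, and only confident, meaningful words are kept as searchable tags. The transcript data and the 0.8 confidence threshold are made up for illustration; real speech-to-text APIs differ in format:

```python
from collections import Counter

# Common words that carry no search value.
STOPWORDS = {"the", "a", "and", "of", "to", "in", "is", "it", "uh"}

def keywords_from_transcript(words_with_confidence, min_confidence=0.8, top_n=3):
    """Turn recognizer output into metadata tags: drop low-confidence
    guesses and stopwords, keep the most frequent remaining words."""
    kept = [word.lower() for word, conf in words_with_confidence
            if conf >= min_confidence and word.lower() not in STOPWORDS]
    return [word for word, _count in Counter(kept).most_common(top_n)]

# Hypothetical recognizer output for a short clip.
transcript = [("Budget", 0.95), ("the", 0.99), ("budget", 0.91),
              ("meeting", 0.85), ("uh", 0.40), ("meeting", 0.88)]
tags = keywords_from_transcript(transcript)
```

The confidence threshold is where the "probable guesses" mentioned above show up in practice: raising it trades recall for precision in the resulting metadata.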

Recognizing speech and people in images is just the start of what AI can do to help people manage large amounts of rich content. As the processing power available in the cloud and locally increases, automated metadata generation from raw content can become more sophisticated. In addition to creating metadata on what was said, machine learning programs can analyze the frequency components of speech to recognize who is speaking. This can be combined with image recognition to improve the accuracy of participant metadata in a video scene that includes audio.
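The idea of identifying a speaker from frequency components can be sketched in miniature. This toy version estimates a clip's fundamental frequency from zero crossings and matches it against enrolled speakers' typical pitches; real systems use much richer spectral features (such as MFCCs) and trained models, and the names and pitch values here are invented:

```python
import math

def estimate_pitch(samples, sample_rate):
    """Crude fundamental-frequency estimate from zero crossings.
    A stand-in for real spectral analysis."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a < 0 <= b) or (b < 0 <= a))
    # One full cycle of a tone produces two zero crossings.
    return crossings * sample_rate / (2 * len(samples))

def identify_speaker(samples, sample_rate, enrolled):
    """Match a clip to the enrolled speaker with the closest pitch."""
    pitch = estimate_pitch(samples, sample_rate)
    return min(enrolled, key=lambda name: abs(enrolled[name] - pitch))

rate = 8000
# One second of a 220 Hz tone, standing in for a voice recording.
tone = [math.sin(2 * math.pi * 220 * t / rate) for t in range(rate)]
enrolled = {"alice": 210.0, "bob": 120.0}
who = identify_speaker(tone, rate, enrolled)
```

Even this crude frequency measurement separates two speakers with distinct pitch ranges, which is the intuition behind combining audio and image cues for participant metadata.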

Video image recognition can also extend to backgrounds, aided by GPS data captured with the original content, and to the types of action going on in a set of images. For instance, AI can determine whether there is motion in the video (perhaps distinguishing a scene shot in a moving car from one where the participants are all sitting in a single room). All of this information can become part of the metadata used for finding and managing content.
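The motion-detection idea can be sketched with simple frame differencing: compare consecutive frames and tag the clip "motion" if brightness changes enough between them. Frames here are flat lists of grayscale pixel values, and the threshold is a placeholder rather than a calibrated value; production systems work on real decoded frames with far more robust methods:

```python
def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel brightness change between two frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def motion_tag(frames, threshold=10.0):
    """Label a clip 'motion' if any consecutive frame pair changes
    more than the threshold, else 'static'. Threshold is illustrative."""
    for prev, cur in zip(frames, frames[1:]):
        if mean_abs_diff(prev, cur) > threshold:
            return "motion"
    return "static"

# Tiny 16-pixel "frames": nearly identical vs. a sudden brightness jump.
static_clip = [[50] * 16, [50] * 16, [51] * 16]
moving_clip = [[50] * 16, [90] * 16, [50] * 16]
```

The resulting "motion"/"static" label is exactly the kind of automatically derived tag that can join GPS and speech metadata in the asset record.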

In the future, AI may be able to determine the emotions and activities of the different participants in a scene, making that information available for accessing and managing the content. AI could even help with content editing, matching actions, participants, dialog and other important characteristics to select the best version of a scene for every participant, with automatic tools that smooth over discrepancies between the various shots.