I am about to head out to Las Vegas for the NAB Conference, and data is starting to take a seat at the very crowded NAB table. In contrast to past years, which focused primarily on production equipment and related technologies, this year's conference sessions and exhibits put data and over-the-top content delivery front and center.

Data is taking center stage at NAB due in large part to consumers' growing appetite for searchable content. Like most television and movie fans, I am in heaven with the amount of great content available to watch these days and the flexibility of where I can view it. Of course, I do maintain and pay for several subscriptions (DIRECTV, Hulu Plus, Netflix and Amazon Prime) to get easier access to all of this rich content. I can't get enough of Better Call Saul, Crazy Ex-Girlfriend, Flaked or more obscure titles like I'll Have What Phil's Having. I'm sure you've got your favorites too.

What makes this addictive content so easily accessible to viewers like me are the incredible advances in user experience and in how content choices are presented. But beyond slick new UIs, the powerful search technologies behind the scenes are the real unsung heroes of this new era of "video everywhere," or ubiquitous content. More than ever, making content fast and easy to find is critical to the success of content creators, distributors and the ecosystem as a whole.

What should be obvious to media and entertainment executives (and not at all to viewers, who just want to find, watch and talk about their downloads) is just how technically difficult it has been to pull off an effortless "video everywhere" experience. It's the kind of thing that keeps studio IT execs up at night.

Why Video Everywhere Is Hard

Today's production workflows make it incredibly challenging to preserve and manage data throughout the production lifecycle. Productions rely on multiple disparate software applications that do not share information, making it difficult to organize the data collected from these systems. And this data wrangling problem continues long after the production is finished.

I recently discussed this data dilemma at the vNAB Cloud Innovation Conference with my colleague Michael Malgeri. At vNAB we presented on the importance of creating and managing data throughout the entire production lifecycle, and discussed the "Suitcase Project" that MarkLogic is working on with the Entertainment Technology Center (ETC). The Suitcase is a short film created by Abi Corbin, and the technical test is being led by the ETC's Erik Weaver. The project is evaluating network-based workflows with a focus on metadata and high dynamic range for motion pictures. The MarkLogic database platform and the C4 ID system[1], which uniquely identifies assets, are the primary technologies for managing the data from the multiple divergent systems used throughout the production of The Suitcase.

Upstream Metadata Extraction

One of the key goals of the Suitcase project is upstream metadata extraction: capturing structured metadata during the production process so that it does not have to be recreated downstream. Richer metadata lives upstream (pre-production and production) but is subject to change; data extracted toward the end of the post-production workflow is more authoritative, but not always as rich.

We are able to create this new metadata management workflow by using the C4 production framework and IDs. The C4 ID system provides an unambiguous, universally unique ID for any file or block of data, which makes it incredibly useful for version control: as files change, new C4 IDs and copies of the files are retained as versions. Semantic data modeling is then applied through ontologies (creating relationships between controlled vocabularies and authority lists), which are ingested into the MarkLogic NoSQL semantic database. Finally, instance data is created (the final cut of The Suitcase is "tagged") and queries are constructed for data extraction.
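The content-addressed versioning idea can be sketched in a few lines of Python. This is a hedged illustration, not the official C4 implementation (the real spec derives a base58-encoded ID with a "c4" prefix from a SHA-512 digest); the "c4-demo-" prefix and the function name here are my own:

```python
import hashlib

def content_id(data: bytes) -> str:
    """Return a deterministic, content-derived identifier.

    Illustrative only: real C4 IDs use base58 encoding and a "c4"
    prefix; plain hex is used here to keep the sketch simple.
    """
    return "c4-demo-" + hashlib.sha512(data).hexdigest()

# Identical bytes always yield the identical ID, so every system that
# sees the same file computes the same name -- no central registry
# needed. Any edit to the file produces a new ID, i.e. a new version.
v1 = content_id(b"scene-042 raw footage")
v2 = content_id(b"scene-042 color graded")
assert v1 != v2                                     # edit => new version ID
assert v1 == content_id(b"scene-042 raw footage")   # reproducible
```

Because the ID is computed purely from the file's bytes, divergent systems on a production can independently agree on what an asset is called, which is what makes C4-style IDs useful for reconciling data across tools.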

A search application then leverages the tagged film data along with user profile information, allowing users to view the movie and its descriptive metadata at the same time. This sets the stage for custom information delivery by matching descriptive metadata with user appetites for information such as the funniest one-liners, scenes with specific actors, scenes depicting certain wardrobe styles, or certain types of action scenes, e.g. car chases or battles. The final deliverable of the project will be a white paper documenting the project, the technologies used and the lessons learned.
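As a rough sketch of what such metadata-to-profile matching might look like (the scenes, tags and function here are invented for illustration and do not reflect MarkLogic's actual query interface):

```python
# Hypothetical per-scene descriptive metadata, as might be produced by
# tagging the final cut of a film.
scenes = [
    {"scene": 3,  "tags": {"one-liner", "kitchen"}},
    {"scene": 7,  "tags": {"car chase", "night"}},
    {"scene": 12, "tags": {"battle", "car chase"}},
]

def scenes_for(profile_tags):
    """Return scene numbers whose tags overlap the viewer's interests."""
    return [s["scene"] for s in scenes if s["tags"] & set(profile_tags)]

print(scenes_for({"car chase"}))  # → [7, 12]
```

A viewer whose profile expresses an appetite for car chases is served exactly the scenes tagged that way, which is the essence of matching descriptive metadata to user profiles.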

At NAB, MarkLogic will be part of the Cloud Innovation Conference, where Erik Weaver and our M&E CTO, Matt Turner, will examine the "Suitcase Project" as part of a panel discussion on cloud-based workflows and the importance of data across the digital supply chain. Add this session to your MyNAB schedule. You can also interact with us on your mobile phone by playing "Suitcase Bingo at NAB" to win a Fitbit Surge.

We will also be helping to put on the MESA Las Vegas Reception at NAB on Tuesday, April (get your invite here). MESA (the Media & Entertainment Services Alliance) has been pivotal in raising the importance of data for the media and entertainment industry, and they throw a great party!

I look forward to all of the great conversations and excitement around data at NAB. See you in Vegas!

[1] Kolden, J. (2015, September 24). The C4 Identification System: Universally Consistent Identification Without Communication [Scholarly project]. Retrieved April 10, 2016.