SMPTE kicked off its 2017 Annual Technical Conference & Exhibition on Monday with an all-day symposium on artificial intelligence and its emerging role in entertainment production and distribution. Among the day’s presentations, SMPTE’s Richard Welsh presented a compelling primer on AI, Google’s Jeff Kember discussed the differences between supervised and unsupervised systems, Hitachi Vantara’s Jay Yogeshwar addressed using machine learning and AI for production workflow, Yvonne Thomas of Arvato Systems looked at the value of effective data analytics, Greg Taieb of Deluxe addressed language localization for multilingual distribution, and Aspera co-founder Michelle Munson examined next generation network design.

Richard Welsh, SMPTE’s VP of education, began the day with an excellent 30-minute primer on AI. The term AI is used to describe three very different levels of computational capabilities.

Machine learning (ML) involves algorithms that are trained to do a specific task. Deep learning (DL) involves regenerative algorithms; they can learn through unsupervised feedback loops, but are cognitively constrained to one area of expertise. General artificial intelligence (GAI) involves free learning networks.

GAI is what most people picture when they think of AI, but GAI does not yet exist in the commercial arena. Most of what people and marketers call AI is really a machine learning implementation.

There are two forms of machine learning algorithms: regression algorithms (decision trees) and classification algorithms (index and compare). Welsh cautioned the crowd not to anthropomorphize the technology. AI does not mimic the human mind. It is "necessarily incomprehensible," he said, which can lead it to come up with unexpected yet valid responses.
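The distinction Welsh drew can be sketched in a few lines of code. This toy example (not from the talk; the data and function names are hypothetical) applies the same 1-nearest-neighbor idea to both tasks: regression predicts a continuous number, while classification indexes and compares to return a discrete label.

```python
# Hypothetical sketch: one lookup routine, two kinds of target.

def nearest(train, x):
    """Return the training pair whose input is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))

# Regression: targets are continuous values.
reg_train = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
def regress(x):
    return nearest(reg_train, x)[1]

# Classification: targets are discrete labels ("index and compare").
cls_train = [(1.0, "dialogue"), (5.0, "music"), (9.0, "effects")]
def classify(x):
    return nearest(cls_train, x)[1]

print(regress(2.2))   # -> 20.0 (closest training input is 2.0)
print(classify(4.0))  # -> music (closest training input is 5.0)
```

The machinery is identical; only the type of answer changes, which is why the two families are usually introduced together.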

Jeff Kember from Google’s office of the CTO discussed the difference between supervised systems, where you have a specific output in mind, and unsupervised systems, in which reinforcement learning takes place and both the algorithm and the output are modified.
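The contrast Kember described can be illustrated with a minimal sketch (hypothetical data and function names, not from the presentation): a supervised learner fits toward known targets, while an unsupervised one must organize the data on its own, here with a tiny 1-D two-means clustering.

```python
# Hypothetical sketch contrasting supervised and unsupervised learning.

data = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]

# Supervised: labels are given, so learn one mean per known class.
labels = ["low", "low", "low", "high", "high", "high"]
def supervised_means(xs, ys):
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {k: sum(v) / len(v) for k, v in groups.items()}

# Unsupervised: no labels; iteratively refine two cluster centers.
def two_means(xs, iters=10):
    a, b = min(xs), max(xs)
    for _ in range(iters):
        ca = [x for x in xs if abs(x - a) <= abs(x - b)]
        cb = [x for x in xs if abs(x - a) > abs(x - b)]
        a, b = sum(ca) / len(ca), sum(cb) / len(cb)
    return a, b

print(supervised_means(data, labels))  # means per known class
print(two_means(data))                 # centers discovered from data alone
```

Both end up near the same two centers here, but only the supervised version knows what those groups are called, which is the practical difference when tagging media assets.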

He described how machine learning will enhance the end-to-end production workflow, from adding content and technical metadata to camera raw footage to supporting multi-channel day-and-date distribution and long-term archiving.

Yvonne Thomas, product manager for Arvato Systems, explained how data analytics become both more valuable and more complex as you move along the descriptive, diagnostic, predictive, and prescriptive analytics chain.

She described the machine learning training loop this way: you input a starting data set, the algorithm outputs a decision, the system accepts feedback on that decision by comparing it to an established set of rules and goals, and the algorithm’s parameters are adjusted before the next round of input. She stressed that “using a media analytics service requires some sort of social responsibility.”
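The loop Thomas outlined can be sketched as a few lines of Python (a hypothetical illustration, not her system): the learned "parameter" here is a single threshold that is nudged whenever the algorithm's decision disagrees with the feedback from the rule set.

```python
# Hypothetical sketch of the training loop: input -> decision ->
# feedback against the rules/goals -> parameter adjustment.

def train(inputs, goal, rounds=50, lr=0.05):
    threshold = 0.0  # the starting parameter
    for _ in range(rounds):
        for x in inputs:
            decision = x > threshold   # the algorithm outputs a decision
            feedback = x > goal        # feedback from the established rules
            if decision and not feedback:
                threshold += lr        # adjust parameters before next round
            elif feedback and not decision:
                threshold -= lr
    return threshold

inputs = [i / 10 for i in range(11)]   # 0.0, 0.1, ..., 1.0
learned = train(inputs, goal=0.5)
print(learned)  # settles close to the goal of 0.5
```

Real systems adjust millions of parameters rather than one, but the input-decision-feedback-adjustment cycle is the same.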

Greg Taieb of Deluxe calls the language localization process “transcreation” because you are trying to recreate the sense of the dialog as you translate, taking culture, slang, and other elements into account.

He sees AI starting to play a role in all of this, especially with the rapid development of resources like Google Translate. He also sees a new job title emerging: post editor, responsible for QC-ing the machine translation.

Martin Wahl, principal program manager for Microsoft’s Azure Media Services group, described the company’s Cognitive Services bundle. It includes vision, language, speech, and knowledge services and, most interestingly, video indexing on a frame-by-frame level.

In response to a question from a Library of Congress archivist, Wahl noted that Microsoft uses its own metadata nomenclature, so any merging into a standardized database would require transcreation processing.