AI-automated video shot listing

An IBC Accelerator project led by the Associated Press is exploring the use of AI to automate the production of shot lists and to use the expanded metadata to create an AI-assisted first cut.

Every year Associated Press (AP) produces around 15,000 hours of live video news output from which a further 3,000 hours of clips are created for distribution. To do so, thousands of hours are spent manually shot-listing and manually transcribing that content. When there aren’t the resources to shot-list and transcribe everything fully, good content can get buried in the archive.


Sandy MacIntyre, AP’s VP News, describes the aim of the Accelerator he is leading as trying to better signpost good pictures and good sound-bites while “removing the ‘grunt work’ and liberating people to do more creative things with the editing and production time.”

Outside the current and unprecedented coronavirus pandemic, MacIntyre observes that for agencies such as AP, Reuters or AFP, politics is usually the biggest news genre, possibly accounting for as much as 30% of annual news output. Making political news gathering more efficient would be scalable and transferrable worldwide.

The project is supported by the BBC and Al Jazeera and complements another Accelerator led by Al Jazeera that is looking at practical uses of AI and machine learning (ML) for compliance monitoring.

AP has made an archive of US political content including footage of numerous candidate and presidential debates, rallies, press conferences and campaign trail events available to the Accelerator participants. A wide-ranging catalogue of attributes that the system needs to learn has been compiled including features such as the faces of the main players; steps, stage, podium and so on.

The aim is to automate the shot listing process and then construct foundational edits defined by “recipes”. For example: President Trump walking up to a podium; cut-away of supporters; cut-away of press cameras; soundbite; cut-away to crowd reaction; Trump exits stage; rope-line of glad-handing. The team will also experiment with sentiment analysis where appropriate: for example, to differentiate between supporters and protestors by analysing placards.
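The article does not specify how these “recipes” are represented internally. One plausible sketch, assuming the shot-lister emits time-coded, labelled shots, treats a recipe as an ordered list of required shot types matched against detections in timeline order. All names and data structures here are illustrative, not AP’s actual schema.

```python
# Hypothetical sketch: a "recipe" as an ordered list of shot labels,
# matched against time-coded shots detected by an AI shot-lister.
RALLY_RECIPE = [
    "principal_walks_to_podium",
    "cutaway_supporters",
    "cutaway_press_cameras",
    "soundbite",
    "cutaway_crowd_reaction",
    "principal_exits_stage",
    "rope_line",
]

def assemble_edit(detected_shots, recipe):
    """Pick the earliest detected shot matching each recipe step, in order.

    detected_shots: list of dicts like
        {"label": "soundbite", "start": 312.4, "end": 341.0}
    Steps with no matching shot are simply skipped, so a partial
    foundational edit is still produced.
    """
    edit, cursor = [], 0.0
    for step in recipe:
        candidates = [s for s in detected_shots
                      if s["label"] == step and s["start"] >= cursor]
        if candidates:
            shot = min(candidates, key=lambda s: s["start"])
            edit.append(shot)
            cursor = shot["end"]  # keep the edit moving forward in time
    return edit
```

A human editor would still refine the result; the point of the recipe is only to produce a usable foundational cut from the shot list.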

Current AI capabilities

Having monitored the emerging AI scene for some time, MacIntyre perceives that speech-to-text transcription is probably the best developed AI capability at present, followed by facial recognition, then object and voice recognition. Sentiment analysis is lagging.

However, improving transcription capabilities further could yield important incremental benefits. For events such as the Democratic Primary debates which, early on, involved nine or ten candidates, achieving frame-accurate identification of speakers would be invaluable.

Improved transcription capabilities should make it easier to search for and publish soundbites containing keywords, and would contribute to a much more useful Edit Decision List (EDL) tool.
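To make the idea concrete, a keyword search over a time-coded, speaker-attributed transcript could emit in/out points directly usable in an EDL. The transcript format below is an assumption for illustration, not the project’s actual data model.

```python
# Illustrative sketch: search a time-coded transcript for a keyword and
# return simple in/out points an editor could drop into an EDL.
def find_soundbites(transcript, keyword, pad=1.0):
    """transcript: list of {"speaker", "start", "end", "text"} segments.

    Returns (speaker, in_point, out_point) tuples for segments mentioning
    the keyword, padded by `pad` seconds on each side to give the editor
    handles around the soundbite.
    """
    hits = []
    for seg in transcript:
        if keyword.lower() in seg["text"].lower():
            hits.append((seg["speaker"],
                         max(0.0, seg["start"] - pad),
                         seg["end"] + pad))
    return hits
```

With frame-accurate speaker identification, the same lookup answers “find every time Candidate A said ‘healthcare’” in seconds rather than hours of manual logging.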

“The ambition is to get to a working prototype that is able to take raw video and process it in a way that will create close to a final edited product while removing a tonne of time and effort,” says MacIntyre. The content could also be better because it should be quicker to search all the available footage and identify the most appropriate, relevant and newsworthy content.

Vidrovr is doing the heavy lifting on the training dataset. The New York company’s platform takes all of the data signals that come with a video – text, audio, visual, motion – to provide a detailed understanding of what is happening in each frame. This capability can be used on a live stream to alert a user when something they are interested in appears or to find the optimal clip when searching through an archive.

Metaliquid is bringing voice recognition and a user-friendly interface to the Accelerator. Founded in Italy in 2016, Metaliquid offers a proprietary deep learning framework which can be delivered in cloud or on-premises. The company’s AI algorithms can extract descriptive time-coded metadata in real-time to identify and recognize thousands of different content attributes.

“What broadcasters need is a flexible and efficient feature-extraction tool, able to analyse a large amount of data and to react in real-time. Video content analysis and the subsequent extraction of metadata can be used to improve search in archives and real-time footage, selecting clips of interest automatically and boosting content production,” explains Giulia Morra, Metaliquid’s US Country Manager.

Metaliquid services use REST APIs and output JSON files containing time-coded information on what is happening frame-by-frame. The solution can be integrated with any media asset management, production or workflow software package.
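A minimal sketch of consuming that kind of time-coded JSON metadata is shown below. The response shape is invented for illustration; Metaliquid’s actual schema is not documented in this article.

```python
import json

# Hypothetical payload in the spirit of the article: time-coded events
# (faces, objects, etc.) extracted from one video asset.
sample_response = json.loads("""
{
  "asset_id": "rally_2020_03_01",
  "events": [
    {"type": "face",   "label": "Candidate A", "start": 12.0, "end": 45.5},
    {"type": "object", "label": "podium",      "start": 10.0, "end": 60.0}
  ]
}
""")

def events_between(payload, t0, t1):
    """Return all events whose time range overlaps the [t0, t1] window."""
    return [e for e in payload["events"]
            if e["start"] < t1 and e["end"] > t0]
```

Because the output is plain time-coded JSON over REST, a MAM or edit tool only needs an HTTP client and a JSON parser to consume it, which is what makes the “integrates with any package” claim plausible.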

MacIntyre believes the Accelerator is a great way to discover the current limitations of AI and machine learning and to better understand what is required to teach an ML system how to improve. “It’ll be interesting to discover what it succeeds with, what it struggles with. For example, we know that when the principal in a video – Trump, Biden, whoever – is inside a big sea of faces it’s going to struggle to pick him up, even if it knows who he is. So we’ll have an opportunity to better understand the impact of resolution, pixelation and so on.”

Noting that every video-first news and media company is currently stretched thin, Joe Ellis, Co-founder & CEO of Vidrovr, says: “The beauty of the accelerator program is that you can get every stakeholder into a virtual room to work collaboratively and constructively to think about where we want to be as an industry in 12-18 months and then chart the roadmap to get there. It’s the perfect opportunity to tackle big hard problems to bring on real transformation.”

To find out more about the IBC2020 Accelerator Media Innovation Programme or to get involved, click here.
