Lessons From the 2016 AES Convention

At the 2016 Audio Engineering Society Convention in Los Angeles, I ran into a friend who was manning the booth of a broadcast audio console manufacturer. I had finally managed to get to the show floor for a quick walkthrough of the exhibits before heading back to another meeting room. He expressed consternation about why they were there since it wasn’t a broadcast show and few attendees would be looking for a high-quality digital audio console.

This struck me as odd since there were many years where that was exactly why I went to the show. Manufacturers of larger audio products now tend to stick to broadcast-focused shows since recording and post engineers have gravitated to working inside the box with emulations of hardware, including features from audio consoles.

As I pondered the implication of my friend’s words—that the convention dedicated solely to audio is no longer the place for audio consoles with amazing sound and incredible technology—I realized that the dynamic of the convention has changed for me as well. It’s no longer primarily about getting to play with fantastic new audio products, but it has become an opportunity to dive into the technology of what we do and see how it is being used.

AUDIO FOR VR

Judging the show by exhibit space alone gives an incorrect impression of what is going on there, because the technical program of workshops, papers, tutorials and technical tours was incredibly busy. In addition, the AES Audio for Virtual and Augmented Reality (AVAR) technical conference was held in conjunction with the convention this year, which meant some of us were standing around with boxy, modern-day View-Masters on our faces.

VR and immersive audio seem to be exciting people and offering opportunities in television, film and gaming that may not have previously been on their radar. VR, while an interesting technology, was not one I had seen much practical value in until this show, when I ran across a small company called Audio Fusion, which deservedly won the AES Silver Award in the student design competition.

Audio Fusion has created a virtual studio training environment by modeling an analog recording studio to provide hands-on training for those who don’t have access to an actual studio. Trainees use headsets, headphones and custom controllers to manipulate audio consoles, patchbays and other studio equipment.

NEXT-GENERATION AUDIO

The status and production of immersive and object-based audio, now collectively referred to as next-generation audio (NGA), were discussed in several sessions.

Dolby AC-4 has been selected as the next-generation audio format for the United States, while MPEG-H will be used in South Korea, the country with the most aggressive timeline for ATSC 3.0 rollout. ATSC 3.0 is the first broadcast format designed to deliver broadcast content to devices of all types, not just to televisions, including new devices as they arrive on the scene.

Mobile devices in particular bring with them an array of level and dynamic range problems due to the nearly limitless number of viewing environments they could be used in. AC-4 will manage these mobile environments, as well as static listening environments, by using multiple metadata profiles and rendering audio at each device. MPEG-H proponents remain skeptical as to whether the metadata in AC-4 can survive the distribution process despite the benefits of audio control and interface customization.

AUDIO FOR OTT

At last year’s convention, the AES Technical Committee on Transmission and Broadcasting published their technical document for audio music streaming and this year their subcommittee, AGOTTVS, has released technical document AES TD1005.1.16-09, which covers loudness for OTT streaming. “Audio Guidelines for Over-the-Top Television and Video Streaming” provides more than the lengthy acronym for the subcommittee; it also provides initial loudness recommendations for a problem-fraught delivery medium.

There are four recommended practices in the document: the use of agile or static metadata when devices are full-range and distribution is able to support it; a list of ways to handle content when the distribution system does not have metadata capabilities, along with a recommended loudness setting of –16 LKFS for devices with limited dynamic range; the recommendation that all loudness implementations be tested for anomalies; and recommendations for versioning of the same material, with metadata-encoded versions left at full range and the dynamic range reduced for versions without metadata.
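To make the –16 LKFS figure concrete, here is a minimal Python sketch of the arithmetic behind a static loudness offset. It is not taken from the AES document; the function names are hypothetical, and it assumes the integrated loudness of the program has already been measured with a meter. It relies only on the fact that a 1 dB gain change moves a loudness reading by 1 LKFS.

```python
# Illustrative only: function names are hypothetical, not from AES TD1005.
TARGET_LKFS = -16.0  # target quoted above for devices with limited dynamic range


def normalization_gain_db(measured_lkfs: float,
                          target_lkfs: float = TARGET_LKFS) -> float:
    """Return the static gain offset in dB that moves a program's
    integrated loudness from its measured value to the target."""
    return target_lkfs - measured_lkfs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    # A program measured at -24 LKFS needs +8 dB to reach -16 LKFS.
    gain = normalization_gain_db(-24.0)
    print(f"gain: {gain:+.1f} dB, multiplier: {db_to_linear(gain):.3f}")
```

A positive offset like this raises peaks along with everything else, so a real workflow would still need the dynamic range reduction and anomaly testing the document recommends; the sketch covers only the level shift.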

AGOTTVS is made up of broadcasters, manufacturers and streaming companies, and is headed by NBC’s Jim Starzynski, who helped forge the A/85 loudness recommendations that form the core of the CALM Act. It is a testament to this group that they have been successful in engaging some of the streaming providers in the standards process. Hopefully all of them will get involved as the group’s work develops into a standard.

AUDIO-OVER-IP

Sessions on AES67 drilled down into this maturing technology, covering some large deployments of it in real-world live events. One important point that was brought up, and one I’m not sure I’ve stressed enough, is that AES67 is not a competitor to other AoIP technologies, but is meant to help them all work together.

Video is certainly a big part of the broadcast world, but it was not originally mentioned in AES67 despite the possibility that it could be included later. AES67 has now been adopted by the Joint Taskforce on Networked Media (JTNM) in its Video Services Forum (VSF) technical recommendations TR-03 and TR-04 for inclusion into the upcoming SMPTE ST 2110 standard. This means that manufacturers adhering to AES67 technical recommendations now have access to the television market as long as they also adhere to the Networked Media Open Specifications (NMOS) from the AMWA.

There were many other interesting sessions this year, and we’ve only scratched the surface of the few covered here, but we’ve run out of space. December is traditionally a time for gift giving and making resolutions, so I encourage you to consider giving the gift of education in some form this year because it truly is the gift that keeps on giving. Keep on learning!

Jay Yeary is a broadcast engineer and consultant who specializes in audio. He is an AES Fellow and a member of SBE, SMPTE, and TAB. He can be contacted through TV Technology or at transientaudiolabs.com.
