Introduction to Ambisonics - 360 degree audio

Mahantesh Belakhindi

April 4, 2017

Ambisonics is a standard for encoding and decoding 360-degree, that is, full-sphere, surround sound. Most of the relevant patents have now expired, so open source tools for Ambisonics are readily available. Although it was developed in the 1970s, Ambisonics has gained traction recently because YouTube, Oculus VR and Facebook adopted it as a standard for their 360-degree videos. It is now widely used for spatial audio in VR and 360-degree video.

Figure 1: Ambisonics trademark

Ambisonics provides an immersive surround sound experience in which a listener can localize a sound to a particular direction in space. The sound source can also be placed above or below the listener, not just on the horizontal plane. An Ambisonics rendering system therefore has loudspeakers not only at the listener’s ear level but also above and below it (periphony). If the listener is using headphones, binaural processing based on HRTF (head-related transfer function) filtering can be applied to the Ambisonics output to provide the spatial effect.

Figure 2: Loudspeaker setup for Ambisonics

Ambisonics works on the principle that if you can record the sound field on the boundary of a region, you can reproduce the field inside it (the Kirchhoff-Helmholtz integral). This allows a sound source to be placed anywhere around the listener at encoding time. The encoder takes the azimuth and elevation of the sound source as input, and the listener perceives the source at that position.

An Ambisonics encoder can accept one or more microphone signals, even a single mono signal, and generate B-format output. Alternatively, B-format signals can be recorded with microphones such as the SoundField and Brahma, whose capsules are arranged in a tetrahedron. Ambisonics encoded output has at least 4 channels (1st order). These channels are in B-format: W, X, Y and Z. W contains the omnidirectional content, X the front-back component, Y the left-right component and Z the up-down component. Keeping the channels in this format allows a B-format encoded signal to remain compatible with mono, stereo or 5.1 playback systems.

Figure 3: Mapping of B-format channels
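
As a rough illustration of how a mono source is placed at a given azimuth and elevation, here is a minimal sketch of first-order encoding. It assumes the traditional B-format convention in which W is scaled by 1/√2, azimuth is measured counter-clockwise from the front and elevation upward from the horizontal plane. The function name encode_first_order is hypothetical and is not taken from any particular library or from the solution described in this article.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumption: traditional B-format weighting, where W is attenuated
    by 1/sqrt(2); other normalizations are also in use.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                 # omnidirectional content
    x = sample * math.cos(az) * math.cos(el)    # front-back component
    y = sample * math.sin(az) * math.cos(el)    # left-right component
    z = sample * math.sin(el)                   # up-down component
    return w, x, y, z

# A source straight ahead contributes only to W and X;
# a source directly to the left contributes only to W and Y.
front = encode_first_order(1.0, 0.0, 0.0)
left = encode_first_order(1.0, 90.0, 0.0)
```

A whole recording would be encoded by applying the same four gains to every sample of the mono signal.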

For best spatial effect, we need at least as many loudspeakers as we have B-format channels, preferably a few more.

Number of Ambisonics encoded channels = (order + 1)²

Number of loudspeakers >= Number of channels

Higher-order Ambisonics gives better localization of sound sources at the expense of additional channels. 2nd-order and 3rd-order Ambisonics have 9 and 16 encoded channels respectively. More loudspeakers are needed to exploit the advantages of a higher order.
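
The channel and loudspeaker counts follow directly from the formula above. The small sketch below tabulates them for the first three orders; the helper name ambisonic_channel_count is only illustrative.

```python
def ambisonic_channel_count(order):
    """Number of encoded channels for full-sphere Ambisonics: (order + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2

# Channel counts for 1st, 2nd and 3rd order. By the rule above, each value
# is also the minimum number of loudspeakers for that order.
counts = {order: ambisonic_channel_count(order) for order in (1, 2, 3)}
```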

An Ambisonics decoder can decode the B-format signals for any loudspeaker setup, or to binaural output for headphones, so the encoding process does not need to know the loudspeaker setup. This is in contrast to other surround sound standards, whose channels carry loudspeaker signals (separate channels for loudspeakers placed at the front, back, etc.). Thus, Ambisonics can coexist with existing mono, stereo and 5.1 setups.
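
To make the idea of layout-independent decoding concrete, the sketch below shows one simple decoder design, a basic projection decoder for a regular layout. It is only one of several possible designs and is not necessarily the one used in any given product. It assumes first-order B-format with W scaled by 1/√2, and an 8-loudspeaker cube; the names decode_first_order and CUBE_LAYOUT are hypothetical.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Direction cosines for an azimuth/elevation pair given in degrees."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el),
            math.sin(az) * math.cos(el),
            math.sin(el))

# Assumed layout: loudspeakers at the 8 corners of a cube around the listener.
CUBE_ELEVATION = math.degrees(math.atan(1.0 / math.sqrt(2.0)))
CUBE_LAYOUT = [(az, el)
               for el in (CUBE_ELEVATION, -CUBE_ELEVATION)
               for az in (45.0, 135.0, 225.0, 315.0)]

def decode_first_order(w, x, y, z, layout):
    """Project one B-format sample onto each loudspeaker direction.

    Assumes a regular 3D layout and W carrying a 1/sqrt(2) gain.
    Returns one feed value per loudspeaker, in layout order.
    """
    n = len(layout)
    feeds = []
    for az, el in layout:
        sx, sy, sz = unit_vector(az, el)
        feeds.append((math.sqrt(2.0) * w + 3.0 * (x * sx + y * sy + z * sz)) / n)
    return feeds

# A unit source encoded exactly in the direction of the first loudspeaker.
x, y, z = unit_vector(45.0, CUBE_ELEVATION)
feeds = decode_first_order(1.0 / math.sqrt(2.0), x, y, z, CUBE_LAYOUT)
```

The same four input values could be passed with a different layout list, which is what keeps the encoder independent of the playback setup. Practical decoders usually add refinements such as frequency-dependent weighting, which this sketch omits.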

Ambisonics can also be used for post-production mixes. Each sound can be encoded as coming from a different direction. Sound sources can also be panned across loudspeakers to give the impression of a moving source, using techniques such as Vector Distance Panning or Vector Base Amplitude Panning (VBAP).
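
As an illustration of amplitude panning between loudspeakers, here is a minimal two-dimensional sketch of vector-based amplitude panning for a single loudspeaker pair. It assumes both loudspeakers and the source lie on the horizontal plane, with angles in degrees measured counter-clockwise from the front. The function name vbap_pair_gains is hypothetical.

```python
import math

def vbap_pair_gains(source_deg, first_deg, second_deg):
    """Gains for two loudspeakers so the source appears at source_deg.

    Solves p = g1 * l1 + g2 * l2 for the loudspeaker unit vectors l1, l2
    and the source unit vector p, then normalizes to constant power.
    """
    s = math.radians(source_deg)
    a = math.radians(first_deg)
    b = math.radians(second_deg)
    p = (math.cos(s), math.sin(s))
    l1 = (math.cos(a), math.sin(a))
    l2 = (math.cos(b), math.sin(b))
    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("loudspeakers must not be collinear with the listener")
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

# Sweep a source from the left loudspeaker (+30 degrees) to the right one
# (-30 degrees) in 10-degree steps to mimic a moving source.
sweep = [vbap_pair_gains(angle, 30.0, -30.0) for angle in range(30, -31, -10)]
```

Applying successive gain pairs from the sweep to successive blocks of the source signal moves the perceived source from one loudspeaker to the other.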

Summing up the advantages of Ambisonics –

• The encoding process can be oblivious to the loudspeaker setup at the decoder.
• B-format includes height information, making Ambisonics a full 360-degree surround sound standard. A periphonic arrangement needs a minimum of 6 loudspeakers for height information to be perceived, although the effect is more impressive with 8 loudspeakers.
• Loudspeakers can be placed almost anywhere; they are not required to be arranged in a specific fashion.
• The so-called “sweet spot”, seen in conventional surround sound systems where the listener must be seated in a particular spot to get the surround effect, is comparatively bigger with Ambisonics.
• Sounds can appear within the loudspeaker arrangement with Ambisonics, whereas in other surround sound systems sounds appear to come from the edges of the loudspeaker arrangement.
• B-format can be converted to mono, stereo, 5.1 or binaural output for headphones, making Ambisonics compatible with any existing loudspeaker setup.

PathPartner has ported a real-time Ambisonics encoding solution to the Qualcomm platform. The algorithm runs on the Hexagon DSP as part of the multimedia framework. Our solution is ready to be ported to any platform of interest.