Is Binaural Dead? A Response to Sonic VR’s Latest Blog Post

Before I Begin, Watch This. And Wear Headphones.

Now read this definition of Sound Localization:

Sound localization refers to a listener’s ability to identify the location or origin of a detected sound in direction and distance. It may also refer to the methods in acoustical engineering to simulate the placement of an auditory cue in a virtual 3D space (see binaural recording, wave field synthesis).

Were You Able To Localize My Voice Moving Around You?

Yes? Yes, okay.

The video above was captured with binaural microphones. You just listened to binaural audio.
I’ve shown this video to Broadway sound designers, post-production film mixers, my family, and my landlord. Each and every one of them is able to follow me as I move around them. This is localization.

Recap of Sonic VR’s latest blog post: headphones bypass the part of our auditory system that helps us localize sound; 3D sound can be broken into three elements (Sound Source, Environment, and You); and Sonic VR is creating headphone solutions that put You back into the sound.

Before outlining the fundamental problems with Sonic VR’s case against binaural, let’s first acknowledge some great points they make:

1. BINAURAL HEADPHONE HAS A SPEAKER AT EACH EAR.

All headphones used for entertainment play sound at both ears. In this sense, almost all headphones are “binaural.” Some companies advertise this as a feature.

This is in contrast with some communication headsets, like those worn in customer call centers or by NFL coaches, which may only have a speaker at one ear.

This is spot on. For most sound professionals, binaural audio refers to a type of recorded audio. It describes the recording that you, the user, are listening back to on headphones. A pair of speaker drivers on your head (headphones) can’t be binaural. They can’t turn flat 2-dimensional sound into 3-dimensional externalized binaural sound. Otherwise put: binaural cannot be replicated by software; it is either recorded binaurally at the source or it isn’t.

2. BINAURAL (RENDERING) TAKES SOURCE RECORDINGS AND PLAYS THEM BACK WITH SPATIAL PROCESSING FOR A LISTENER WITH TWO EARS.

The problem is that the “listener” in these algorithms is a generic humanoid head, at best.

This has existed for a number of years with surround sound headphone technologies. It is being rebranded binaural audio, as of late.

For headphones, this adds some spatial information. However, as with binaural recordings, if the head and ears are not the same as for the listener, it doesn’t really work. The perceived locations and sound quality are off.

Also correct. Binaural rendering can only get so far in processing individual sound sources before it becomes unreal. We’re talking very specific algorithms designed to interpret a sound source. This type of technology thrives on heavily prepped audio tracks by pro mixers to create a playback experience.

What happens when VR moves from consumption tool to creation tool? Binaural rendering, says Sonic VR, will not be able to keep up with new recording techniques being used to capture immersive audio.
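To make "binaural rendering" a little more concrete, here is a minimal sketch of the general idea, assuming the "generic humanoid head" is represented by a pair of head-related impulse responses (HRIRs), one per ear. The function names and the toy impulse responses are my own illustrative assumptions, not Sonic VR’s algorithm or any shipping product.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source through a left-ear and a right-ear impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's right.
# These values are made up for illustration, not measured from any head:
# the right ear hears the sound immediately at full level, while the left
# ear hears it two samples later and quieter because the head is in the way.
TOY_HRIR_RIGHT = [1.0, 0.0, 0.0]
TOY_HRIR_LEFT = [0.0, 0.0, 0.5]

click = [1.0, 0.0, 0.0, 0.0]  # a single-sample click as the mono source
left, right = render_binaural(click, TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
print("left: ", left)
print("right:", right)
```

Swap in impulse responses taken from a different head and the same click comes out with different timing and level at each ear. That is the mismatch Sonic VR is describing.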

Here is a dissection of a few of Sonic VR’s more problematic points:

1. Binaural recording “will always fall short of reality” because the head used to record the audio is different from the head being used to listen to the audio. Essentially, we don’t actually have the ability to localize sound with binaural audio:

If you are listening to a binaural recording done with another’s head or with a dummy head, the sound quality and location accuracy suffer. In fact, what is in front of one listener can sound behind for another. Most of the time, the effect of binaural recording is a slightly spatial sound, but with very poor balance.

For this, I encourage listeners to decide for themselves. A simple search on YouTube for Binaural Music or Binaural Recordings will provide you, the listener, with plenty of examples. These were recorded with a head very different from yours. Does it sound unnatural to you?

IMMERSIVE AUDIO IS AND WILL BE SOMETHING INTEGRATED INTO OUR DAILY LIVES, BUT WILL IT COME IN A HARDWARE OR SOFTWARE PACKAGE?

2. Software-based companies resist binaural audio because it is a hardware (binaural microphone technology) solution. Hardware-based companies resist software for the reverse reason (even though all hardware incorporates complex software).

To argue the case for binaural is to side with the evolutionary masterpiece known as the human ear. What our ears do on a daily basis, hardware tries to mimic and software tries to interpret. From sound waves alone, human ears can detect speed, direction, even the size of a room, by way of very simple time and volume differentials between the left and right ear.
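For a rough sense of how small those time and volume differentials are, here is a back-of-the-envelope sketch. It assumes a rigid spherical head and uses Woodworth’s classic approximation for the interaural time difference; the head radius, speed of sound, and function names are my own illustrative assumptions, and a real head with real outer ears is considerably messier.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature
HEAD_RADIUS = 0.0875    # m, an assumed average head radius


def interaural_time_difference(azimuth_deg, head_radius=HEAD_RADIUS):
    """Woodworth's spherical-head estimate of the arrival-time gap, in seconds.

    azimuth_deg is the source angle from straight ahead, valid for 0 to 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius / SPEED_OF_SOUND) * (theta + math.sin(theta))


def interaural_level_difference_db(near_ear_amplitude, far_ear_amplitude):
    """Level gap between the two ears in decibels, from linear amplitudes."""
    return 20.0 * math.log10(near_ear_amplitude / far_ear_amplitude)


for angle in (0, 30, 60, 90):
    itd_us = interaural_time_difference(angle) * 1e6
    print(f"{angle:>2} degrees off center: about {itd_us:.0f} microseconds")

# A far ear receiving half the amplitude of the near ear is about 6 dB quieter.
print(f"{interaural_level_difference_db(1.0, 0.5):.1f} dB")
```

Under these assumptions, a sound directly to one side reaches the far ear well under a millisecond after the near ear, and that tiny gap is a large part of what we use to place it.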

Our ears are the ultimate 3-dimensional binaural microphones. They do this by using the outer ear (the pinna) to funnel and reflect acoustic pressure waves as they enter the ear canal. Simply put: if we had no outer ears, we’d experience the world completely differently. All one has to do is place a microphone (albeit very specifically within the pinna) to have the highest form of sonic reproduction known to man: capturing sound exactly the way ears experience it.

So the discussion continues between localization and spatialization. The end user will ultimately decide between experiencing audio the way our ears record it and the way software interprets it. But in the contest between human and machine, when human is a viable option, who would want to experience sound captured or processed any other way?

Comments

Hello Anthony. One quick question about the video you linked. It’s my understanding that the Virtual Barbershop demo is a Starkey Lab’s "Cetera Algorithm" demo. Is there any difference between the Cetera Algorithm and actual binaural audio that’s recorded using a person’s or a dummy’s head?

Hey Jason! Great question. Yes, you are correct in saying this utilized software. However, it was software used to demonstrate binaural audio technology, which means it was showcasing a sonic environment by way of spatialization, something only binaural technology can achieve. It’s a bit confusing because most software solutions these days utilize spatialization to accomplish said sonic environments. Localization utilizes the way an ear naturally processes sound within an environment to create a sonic environment, i.e., binaural. In this case, though it was not recorded with a dummy head, it was processed to appear as though it was. Normally software doesn’t go this far to reproduce localization, but in my opinion it only further proves that localization is the way to go when generating virtual sonic environments. Hope that helps!