Binaural Hearing, Ear Canals, and Headphone Equalization

David Griesinger, Harman Specialty Group

Two closely related threads:

1. How can we capture the complete sonic impression of music in a hall, so that halls can be compared with (possibly blind) A/B comparisons?

Spikofski’s work at the IRT Munich promoted the idea of “diffuse field equalization” as the natural standard for both dummy head recording and headphone reproduction. The result was implemented in the Neumann KU-81 dummy-head microphone. I went right out and bought one!

To equalize headphones, put them on the equalized dummy, and adjust the headphone equalization until a flat response is achieved. Good Luck…

Theile published a comprehensive paper on the subject, which suggested that one could make an individual headphone calibration by putting a small microphone in the ear canal (partially blocking it) and then matching the headphones to a diffuse acoustic field.

But this also did not work for me. The resulting headphone equalization was far from natural, and unbalanced between the two ears.

Theile’s arguments however were compelling:

It should not be necessary to measure the sound pressure at the eardrum if one is only trying to match the sound pressure at the entrance of the ear canal to an external sound field.

If headphones are equalized to match a frontal HRTF of an average listener, then ordinary stereo signals will sound very dry and unnatural.

Since such signals are intended to be heard in a room – at some distance from the speakers – the headphones should be equalized to match the total sound pressure in the room.

This implies that diffuse field equalization is correct for headphones.

If headphones are equalized for the diffuse field, then dummy heads need to be equalized for the diffuse field.

In this case a dummy head recording will be correctly reproduced.

Alas – this argument implies that a diffuse-field equalized dummy head will not reproduce correctly over loudspeakers! By the same reasoning, a dummy head equalized for speakers must have a free-field frontal equalization.

The author published a paper on this subject 20 years ago, and had personal conversations with Stephan Peuss at Neumann.

An excellent paper by Hammershoi and Moller investigated whether the ear canal influenced the directional dependence of the human pinna system.

They concluded that measuring the sound near the entrance of the ear canal captured all the directional dependence, and it was not necessary to go to the eardrum.

This paper has been taken as conclusive proof that the ear canal is not relevant for headphone equalization or dummy head recording.

But Hammershoi and Moller say “The most immediate observation is that the variation [in sound transmission from the entrance of the ear canal to the eardrum] from subject to subject is rather high…The presence of individual differences has the consequence that for a certain frequency the transmission differs as much as 20dB between subjects.”

Thus the directional dependence is correct – but the timbre is so incorrect that our ability to perceive these directions is frustrated. (And the sound can be awful…)

The work of Spikofski, Theile, and Moller all rests on the assumption that human hearing rapidly adapts to even grossly unnatural timbres.

That is, the overall frequency response does not matter for localization, only relative differences in frequency response.

Alas, this is exceedingly unlikely. It seems clear that rapid, precise sound localization would be impossible without a large group of stored frequency response expectations (HRTFs) to which an incoming sound could be rapidly compared.

Human hearing does adapt to timbre – as we will see – but adaptation takes time, and needs some kind of (usually visual) reference.

That absolute frequency response at the eardrum is unimportant for binaural reproduction is seductively convenient. But it violates common observation:

The argument is based in part on the perceived consistency of timbre for a sound source that slowly moves around a listener.

But perceiving timbre as independent of direction takes time. If a source moves rapidly around a listener it is correctly localized, but large variations in timbre are audible.

Clearly the brain uses fixed response maps to determine elevation and the out-of-head impression, and compensates for timbre at a later step.

I was just in the Audubon Sanctuary in Wellfleet at 8am, surrounded by calling birds in every direction. I felt I could precisely localize them – but I could tell you nothing about their timbre.

Walking under an overhead slot ventilator at Logan at about 3.5mph, I noticed a very strong comb-filter sound. When I retraced my steps at 1.5mph the timbre coloration was completely gone. In both cases the sound was correctly localized.

Bottom Line: Accuracy of frequency response AT THE EARDRUM is essential for correct localization with binaural hearing.

It has been noticed that standard ear-canal-independent methods of calibrating dummy heads and headphones do not work very well.

It is almost universal that subjects claim headphone images localize inside the top of the head.

However, when a dummy head tracks a listener’s head motion there is sufficient feedback that a frontal image is restored.

Although the process may take a minute or so.

Therefore head tracking has been assumed to be an essential part of any dummy head recording system.

But none of us need to move our heads to achieve external, frontal localization.

Head motion produces azimuth cues that are so compelling that the brain quickly learns to ignore timbre cues from the pinna. But this is not an ideal solution, as issues that depend on timbre, such as intelligibility and sound balance, are incorrectly judged.

This graph shows the frequency response and time response of the digital inverse of the two probes as measured against a B&K 4133 microphone.

Matlab is used to construct the precise digital inverse of the probe response, both in frequency and in time. The resulting probe response is flat from ~25Hz to 17kHz. In general, I prefer NOT to use a mathematical inverse response, as these frequently contain audible artifacts. I minimized these artifacts here by carefully truncating the measured response as a function of frequency.
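The author’s Matlab code is not shown, so as a rough illustration of the idea – inverting a measured response only over the band where the probe is trustworthy – here is a Python sketch. The function name, the band limits, and the regularization term (standing in for the careful frequency-dependent truncation described above) are my own assumptions:

```python
import numpy as np

def invert_probe_response(h, fs, lo=25.0, hi=17000.0, reg=1e-3):
    """Frequency-domain inverse of a measured impulse response h,
    restricted to the band [lo, hi] Hz where the probe is reliable.
    The regularization term keeps 1/H from exploding at frequencies
    where the measured response is weak or noisy."""
    n = len(h)
    H = np.fft.rfft(h)
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    inv = np.conj(H) / (np.abs(H) ** 2 + reg)   # regularized 1/H
    inv[(f < lo) | (f > hi)] = 0.0              # discard the unusable band
    return np.fft.irfft(inv, n)                 # inverse-filter taps
```

Convolving the probe output with this filter flattens the response over the measured band; a real implementation would also enforce causality (for example by delaying and windowing the result) to keep the artifacts inaudible.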

Carefully place headphones on the listener while the equalized probe microphones are in place.

Measure the sound pressure at the listener’s eardrums as a function of frequency, and construct an inverse filter for these particular phones.

If this is done carefully, the sound pressure during the recording will be exactly reproduced at the eardrum.

With several tries, a very successful equalization can be found.

I prefer to construct an inverse filter using a small number of minimum phase parametric filters, rather than a strict mathematical inverse. The mathematical inverse tends to over-compensate dips in the response.
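One way to realize such a filter – a sketch, not the author’s actual tool chain – is a cascade of standard minimum-phase peaking biquads in the well-known RBJ “Audio EQ Cookbook” form; the function names here are mine:

```python
import numpy as np

def peaking_biquad(fs, f0, gain_db, q):
    """Minimum-phase peaking-EQ biquad (RBJ Audio EQ Cookbook form).
    Returns normalized (b, a) coefficients; the gain at f0 is exactly gain_db."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1.0 + alpha * A, -2.0 * np.cos(w0), 1.0 - alpha * A])
    a = np.array([1.0 + alpha / A, -2.0 * np.cos(w0), 1.0 - alpha / A])
    return b / a[0], a / a[0]

def biquad_gain_db(b, a, fs, f):
    """Evaluate the biquad's magnitude response (dB) at frequency f."""
    z = np.exp(2j * np.pi * f / fs)
    H = (b[0] + b[1] / z + b[2] / z**2) / (a[0] + a[1] / z + a[2] / z**2)
    return 20.0 * np.log10(abs(H))
```

A handful of such sections, tuned by hand against the probe measurement, behaves far better than a strict mathematical inverse, which tends to over-compensate dips in the measured response.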

The biggest problem is that no-one (in their right mind) will put anything in their ear!

Bigger than their elbow…

But if a madman equalizes a system for himself, can others obtain the benefit?

Considerable benefit is obtained. Most individuals say the headphones sound amazingly realistic in timbre. But frontal imaging may not work well. In my experience there are large differences between individuals in the way high frequencies couple from headphones to the eardrums.

The consequences of these individual differences [as described by Moller] – and what can be done to mitigate them – are the subject of the next section of this talk.

In general, a non-invasive equalization procedure is frequently sufficient to produce realistic playback.

But timbre (the overall equalization) needs to be corrected. Because the actual ear canal transform is unknown, timbre (and elevation) is usually not accurate.

Is it possible to achieve out-of-head localization and frontal imaging with headphones without a head-tracker?

Yes – we do it with our own ears every day. When timbre is accurate it is also possible with headphones. With some adjustment to headphone response, non-individual HRTFs will work for most people (not all…)

Is it possible to achieve out of head perception with a simple delay, without using a measured HRTF?

Yes – but beyond the scope of this talk

What HRTFs should be used in concert hall or car modeling?

There is probably more variance in ear canal geometries than in pinnae. Some kind of individual matching for timbre is needed.

What is the meaning of “flat frequency response?”

The sound pressure at our eardrums is not at all flat, and is different for each individual, and for each sound direction.

Our impression of response is adaptive – but… there are limits.

Altering loudspeaker elevation

Can a speaker on top of a screen, or in the headliner of a car, be made to sound in front of the listener?

The silicone material was “Dragon Skin” from Smooth-On, with a hardness of Shore 10.

The cured silicone positives are covered with more silicone to produce a durable negative for further reproduction.

The outside surfaces of the silicone pinnae are cut away with small scissors to reproduce the compliance of a real pinna, which varies from Shore 3 to 10.

Tiny probe microphones are attached to the apex of the eardrum cavity, and a resistance tube about 3 m in length is attached to the center of the eardrum to simulate the eardrum resistance. 18-gauge PVC tubing was used.

The probe microphones were calibrated to be flat to about 14kHz as referenced to a B&K 4133.

DSP is used on the microphone outputs to apply the resulting equalization.

The result matches probe measurements of my own ears within about 2dB.

Paraffin wax is used to fill the space inside the head around the ear canal and resistance tube to eliminate microphonics.

The outer head was cast with a high-density artist’s foam material from Smooth-On. This material is easily formed and cut.

What if we do the identical experiment, but use a loudspeaker in front of the listener, accurately calibrated to reproduce pink noise with a flat frequency response?

Surprisingly, the listener produces (on average) the following curve:

This is a 6 dB dip at 3000 Hz with a Q of 2. If we add a complementary boost to a headphone equalization based on equal loudness, the result is amazingly satisfactory on ordinary recorded music. The loudspeaker and the headphones then have the same timbre.
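For illustration, the dip can be written as a standard second-order (analog-prototype) peaking curve – an assumed parameterization, since the talk gives only the center frequency, depth, and Q – and in dB its complementary boost cancels it exactly at every frequency:

```python
import numpy as np

def peaking_db(f, f0=3000.0, gain_db=-6.0, q=2.0):
    """Magnitude (dB) of an analog second-order peaking filter.
    The defaults model the averaged listener curve: a 6 dB dip at 3 kHz, Q = 2."""
    A = 10.0 ** (gain_db / 40.0)
    s = 1j * np.asarray(f, dtype=float) / f0      # normalized frequency s = jf/f0
    H = (s**2 + (A / q) * s + 1.0) / (s**2 + s / (A * q) + 1.0)
    return 20.0 * np.log10(np.abs(H))
```

The complementary boost is the same curve with gain_db = +6; the two transfer functions are exact reciprocals, which is why adding the boost to the headphone equalization restores the loudspeaker timbre.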

If we combine the two curves above – that is, the quiet-level loudness expectation and the frequency boost needed to match loudspeaker reproduction – we get a curve that looks like this:

A recording made at the author’s eardrum with probe microphones that have a flat frequency response can be corrected with the inverse of this curve. This recording then sounds remarkably good on loudspeakers, and plays correctly through headphones equalized with the above curve.