Where we're going with Headphone Measurements

Loyal readers of InnerFidelity will naturally be concerned that with Tyll’s retirement the regimen of headphone measurement here will be diluted or even abandoned. Not so. In fact we have plans not just to continue to measure headphones but to expand and improve both the measurements themselves and the way you can access the results. This is a work in progress so we ask your patience while we put in place the new hardware and software that will enable it. But here is an initial insight into what’s in store.

The biggest change on the hardware side is that I will not be performing measurements with the Head Acoustics artificial head that Tyll used. Since 2007 I have been using an artificial ear made specifically for headphone measurement by the Danish company G.R.A.S. Sound & Vibration, the same equipment as was later chosen by Harman for its extensive (and ongoing) research into headphone target responses. Those of you who are Stereophile readers may remember that I wrote a feature about headphone measurement in the August 2008 issue, which you can read HERE if you are looking for background on the issue of headphone measurement, although this was written before Sean Olive at Harman began his work on target response. I see I began the piece by saying, “Headphones get pretty short shrift in much of the hi-fi press, which is puzzling – the headphone market is burgeoning.” How things have changed.

Included in that article is a description of the G.R.A.S. 43AG ear and cheek simulator I use selected parts from, but there have been developments. One of the 43AG’s important features is that it uses the anthropometric artificial pinnae (outer ears) originally made for the KEMAR head and torso simulator (now sold by G.R.A.S.). These are based on a large number of measurements of real human ears and – just like most pairs of human ears – they are asymmetrical: the left pinna is not a mirror image of the right. While it may seem odd to test headphones using left and right pinnae which are differently shaped, this actually reflects normal use; human beings are not symmetrical either. Some headphones cope with this better than others.

Recently G.R.A.S. further improved its artificial pinnae not by changing their shape – that remains as before – but by improving their sealing to the 43AG and making them softer, so that they more closely mimic the physical properties of real ears. This is particularly relevant to supra-aural (on-ear) headphones, and circumaural (over-ear) headphones which have shallow ear spaces. In either of these an artificial pinna that is too stiff can prevent effective sealing of the earpads, resulting in curtailment of bass response. This can be a significant problem if you choose the wrong artificial ear or head – as Rtings.com discovered – requiring it to develop a complex measurement routine to obviate the problem. By using the 43AG with the new G.R.A.S. pinnae we’ll tackle that issue at source.

Even with this problem addressed, headphone frequency responses are inherently variable. Insert headphones produce consistent frequency responses provided that a good seal is achieved, but on-ear and over-ear headphones do not, because of small differences in positioning relative to the ear. I address this variability by removing and replacing the headphone between each of 10 separate response measurements per channel, and then calculating a mean response and 95% confidence limits. On a graph the results look like the example below (Figure 1). The solid red line is the mean response and the blue shading to either side shows the 95% confidence limits (i.e., we can be 95% certain from the measurements that the true mean response lies within the blue area). This form of display is much easier to interpret than multiple overlaid responses. It shows if the headphone has a sealing problem at low frequencies (this one does) and where else in the spectrum the largest disparities occur as a result of small changes in positioning.
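The mean and confidence limits described above can be sketched for a single frequency point as follows. This is a minimal illustration, not the software used for the published graphs: it assumes the ten levels are averaged in dB and that the limits come from a Student-t interval on the mean, neither of which the article specifies. The levels are invented.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 9 degrees of freedom
# (i.e. 10 measurements). Assumption: the article does not state which
# statistic is used to form the confidence limits.
T_CRIT_95_DF9 = 2.262


def mean_and_ci(levels_db, t_crit=T_CRIT_95_DF9):
    """Return (mean, lower, upper) in dB for one frequency point.

    The default t_crit is only valid for exactly 10 measurements.
    """
    n = len(levels_db)
    mean = statistics.fmean(levels_db)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(levels_db) / math.sqrt(n)
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical levels (dB) at one frequency from 10 separate reseatings.
reseatings_db = [92.1, 90.4, 93.0, 88.7, 91.5, 89.9, 92.6, 90.8, 91.2, 89.8]
mean, lower, upper = mean_and_ci(reseatings_db)
print(f"mean {mean:.2f} dB, 95% limits {lower:.2f} to {upper:.2f} dB")
```

Repeating this at every frequency point gives the red mean trace and the edges of the shaded band; a wide band at low frequencies is what a sealing problem would look like.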

As a refinement of this, I will be measuring the effect of spectacles on sealing and also simulating the leakage caused by human hair. Some headphones are barely affected by these; others can suffer significant loss of low frequency output.

The response shown in Figure 1 – measured at the DRP (the drum reference point, i.e., the eardrum) – is uncorrected, and here lies the biggest issue for both headphone design and headphone measurement. What correction should we apply to create a response that’s flat for a headphone with neutral perceived tonal balance? This is a question I addressed at some length in the Stereophile article but since that was written Harman has suggested an alternative target response, requiring a different correction. What’s more, Harman has identified different target responses (and so different corrections) for in-ear and on-/over-ear headphones, and the latter correction has just been revised.

In effect this means there are now three principal headphone response corrections that have been proposed down the decades: free-field (FF), diffuse-field (DF) and Harman (in its various flavours). Rather than choose between them – although the free-field correction is largely historical and discredited – I apply all three, with a result like the example below (Figure 2). The response traces here are third-octave, by the way, because the FF and DF corrections I use (which are due to Møller, whose DF correction has been identified by Harman’s research as the best of the various DF corrections available) are defined at third-octave intervals.
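Applying a correction amounts to subtracting a target curve from the measured response, band by band, so that a headphone matching the target plots as a flat line. The sketch below shows that arithmetic only. All numbers are invented placeholders: they are not the Møller FF/DF data or any Harman target, and the band list is a short hypothetical subset of third-octave centres.

```python
# Third-octave band centres in Hz (a short hypothetical subset).
bands_hz = [250, 500, 1000, 2000, 4000, 8000]

# Hypothetical uncorrected levels at the DRP, in dB.
measured_db = [0.0, 0.5, 2.0, 8.0, 12.0, 4.0]

# Placeholder target curves in dB. These values are invented for
# illustration and are NOT real FF, DF or Harman data. The "Harman"
# placeholder is deliberately equal to measured_db to show the flat case.
targets_db = {
    "FF": [0.0, 1.0, 3.0, 10.0, 14.0, 6.0],
    "DF": [0.0, 0.5, 2.5, 9.0, 12.0, 8.0],
    "Harman": [0.0, 0.5, 2.0, 8.0, 12.0, 4.0],
}


def apply_correction(measured, target):
    """Corrected response = measured minus target, band by band."""
    if len(measured) != len(target):
        raise ValueError("measured and target must cover the same bands")
    return [m - t for m, t in zip(measured, target)]


corrected = {
    name: apply_correction(measured_db, target)
    for name, target in targets_db.items()
}

for name, trace in corrected.items():
    print(name, [f"{v:+.1f}" for v in trace])
```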

Headphone distortion measurement is riddled with issues – issues of relevance, and issues of accuracy. The main problem of relevance is that it remains common practice to measure total harmonic distortion (THD) when it has been known since the 1930s that this correlates poorly with our perception of distortion – I wrote about this in another Stereophile article, HERE. We will add weighted THD versus frequency initially, which is more representative of distortion’s perceptibility, with GedLee Metric (Gm) and Non-Coherent Distortion (NCD) versus frequency measurements being added as soon as possible.

To perform these enhanced distortion measurements accurately it is necessary to address the bane of headphone distortion measurement: microphone self-noise. The G.R.A.S. 40AO microphone in my artificial ear has a specified self-noise figure of 20dBA SPL. So when measuring headphone distortion at 90dB there is only 70dB signal-to-noise ratio at best. An all too obvious manifestation of this problem in published headphone distortion plots is the distortion measured at 100dB SPL being lower than at 90dB SPL (!) over those parts of the frequency spectrum where the distortion level is low. This is spurious: the signal-to-noise ratio is higher in the 100dB SPL measurement, so the THD+N (THD plus noise) trace can drop to a lower level than in the 90dB SPL measurement before noise dominates.
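The spurious drop in THD+N at the higher test level follows from simple power addition. The sketch below is illustrative only: it treats the 20dBA self-noise figure as the total noise seen by the analyser (a real analyser's bandwidth and weighting will change the numbers), and the true distortion of -90dB is a hypothetical value held constant at both levels.

```python
import math


def thdn_db(signal_spl, distortion_db, noise_spl):
    """THD+N in dB relative to the fundamental.

    signal_spl:    level of the fundamental, dB SPL
    distortion_db: true harmonic distortion re the fundamental (negative)
    noise_spl:     measurement noise level, dB SPL
    """
    distortion_power = 10 ** (distortion_db / 10)
    noise_power = 10 ** ((noise_spl - signal_spl) / 10)
    # Distortion and noise are uncorrelated, so their powers add.
    return 10 * math.log10(distortion_power + noise_power)


def db_to_percent(db):
    """Convert an amplitude ratio in dB to a percentage."""
    return 100 * 10 ** (db / 20)


MIC_NOISE_SPL = 20.0       # self-noise figure quoted in the article
TRUE_DISTORTION_DB = -90.0  # hypothetical, same at both test levels

for level in (90.0, 100.0):
    reading = thdn_db(level, TRUE_DISTORTION_DB, MIC_NOISE_SPL)
    print(f"{level:.0f} dB SPL: THD+N {reading:.1f} dB "
          f"({db_to_percent(reading):.4f}%)")
```

With these assumptions the reading at 100dB SPL comes out lower than at 90dB SPL even though the headphone's distortion has not changed, because both readings are dominated by noise.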

You could argue that at around 0.1%, say, distortion becomes irrelevant anyway because it’s imperceptible – but that’s no reason to perform a poor measurement. To fix the problem there is no substitute for time: you have to measure for long enough at each frequency to perform sufficient coherent time averaging to reduce the measurement noise floor to far enough below the bottom end of the graph scale that the graph shows only distortion, without significant noise contribution. I’m working now on software that will implement this, with the intention of reducing the noise floor to at least 100dB down ref 90dB SPL. This will allow accurate distortion measurement to −80dB (0.01%), and I’m hopeful of achieving much better than that.
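Why time is the only cure can be seen in a toy simulation of coherent averaging: the signal repeats identically in every frame while the noise does not, so averaging N time-aligned frames leaves the signal alone and lowers the noise by about 10log10(N) dB. This is a sketch under simple assumptions (white Gaussian noise, perfect time alignment, a hypothetical 256-sample frame), not the software mentioned above.

```python
import math
import random


def residual_noise_rms(num_averages, frame_len=256, noise_rms=1.0, seed=1):
    """Average time-aligned frames of tone plus noise and return the RMS
    of what remains after subtracting the known tone."""
    rng = random.Random(seed)
    tone = [math.sin(2 * math.pi * 8 * i / frame_len) for i in range(frame_len)]
    total = [0.0] * frame_len
    for _ in range(num_averages):
        for i in range(frame_len):
            total[i] += tone[i] + rng.gauss(0.0, noise_rms)
    residual = [total[i] / num_averages - tone[i] for i in range(frame_len)]
    return math.sqrt(sum(r * r for r in residual) / frame_len)


def improvement_db(num_averages, noise_rms=1.0):
    """Noise reduction achieved by averaging, in dB."""
    remaining = residual_noise_rms(num_averages, noise_rms=noise_rms)
    return 20 * math.log10(noise_rms / remaining)


def averages_needed(gain_db):
    """Averages required for a given noise reduction, from 10*log10(N)."""
    return math.ceil(10 ** (gain_db / 10))


print(f"100 averages: {improvement_db(100):.1f} dB lower noise (theory: 20 dB)")
print("averages for a 30 dB reduction:", averages_needed(30))
```

On this simplified view, moving from a 70dB to a 100dB signal-to-noise ratio is a 30dB reduction, which is why the measurement time grows so quickly. The real figure depends on analysis bandwidth, so treat the count as an order-of-magnitude guide.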

Of course, this effort only makes sense in the context of measurement equipment that has very low levels of inherent distortion. To ensure this I am currently buying the parts for and will shortly build a very low distortion headphone amplifier and very low distortion microphone preamplifier specifically for headphone distortion measurement. And we will be unique, I think, in publishing spectra of the residual distortion so you can be assured that our distortion plots show only the distortion of the headphone being tested, without contribution from the test equipment.

The other problem with implementing headphone distortion measurement is that conventional artificial ears – and my 43AG is no different – have an upper limit of frequency response accuracy at around 8-10kHz, so measurement of distortion at high frequencies is inherently inaccurate. G.R.A.S. has gone some way to alleviating this problem recently by introducing a High Resolution Ear Simulator which extends the frequency range out to 20kHz, but this only partly resolves the issue. I’m hoping to address this limitation in a more fundamental way – but I’m keeping mum about what I have in mind until I’m convinced that it’s workable.

It’s also a little early to say precisely how we intend to provide you with enhanced access to the measurements, but I can confirm that they will still be downloadable as a single file for each headphone, although the file will not be a PDF as before. It will be hi-res, though, and for the first time will include some interactive features.

I’ll have lots more to tell you about our new measurement regime as it is rolled out. For now, I hope you’ll be reassured by the foregoing that InnerFidelity will continue to measure headphones, and intends to do so even better than previously.

Maybe it's possible to bring the huge existing database into context with the new measurements by building a correction curve for the old measurements.
It would be necessary to measure a number of specimens that have already been through Tyll's rig and determine the difference from the new setup, separately for around-ear, on-ear and in-ear headphones.
I bet that useful results could be obtained, even if they are not perfectly precise.
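A rough sketch of how such a rig-to-rig correction could be derived: average the per-band difference between new and old measurements over the models measured on both rigs, then add that difference to any legacy curve. The model names and levels below are invented, and the function would be run separately for each headphone type as suggested above.

```python
from statistics import fmean

# Hypothetical levels in dB per band, keyed by model, for one headphone type.
old_rig = {
    "model_a": [1.0, 2.0, 3.0],
    "model_b": [0.0, 1.0, 5.0],
    "model_c": [2.0, 2.0, 2.0],   # only measured on the old rig
}
new_rig = {
    "model_a": [2.0, 2.5, 1.0],
    "model_b": [1.0, 1.5, 3.0],
}


def rig_correction(old, new):
    """Mean (new - old) per band over the models present in both sets."""
    shared = sorted(set(old) & set(new))
    if not shared:
        raise ValueError("no models were measured on both rigs")
    n_bands = len(old[shared[0]])
    return [
        fmean(new[m][b] - old[m][b] for m in shared)
        for b in range(n_bands)
    ]


def translate(old_response, correction):
    """Estimate how a legacy measurement would look on the new rig."""
    return [o + c for o, c in zip(old_response, correction)]


correction = rig_correction(old_rig, new_rig)
print("correction (dB):", correction)
print("model_c on new rig (estimate):", translate(old_rig["model_c"], correction))
```

The spread of the per-model differences around the mean would indicate how far such a correction can be trusted; unit variation and seating differences are ignored here.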

For the distortion measurement a microphone mounted in a flat plate could be used instead of the artificial ear.
Measurement microphones with responses up to 100 kHz are obtainable, with compromises in their noise level.

It would be nice to have a target curve where neutral sounding headphones show something close to a straight line. Although this request sounds trivial at first, I know that's a really hard thing to do.
Tyll tried to do that on his visit to Harman, but eventually stopped working on it.
His approach was basically correct, but needs more work, especially on the measurement side.
His main problem was how to deal with the fact that two speakers feed signal into two ears.
The simple solution is to use uncorrelated pink noise for the measurement.
Even better if the pink noise blends into correlated noise in the LF range, e.g. below 300 Hz, which is even closer to how music is usually produced.

There may not be a single straight line for all headphones, but if a particular headphone is truly neutral, the computer should be able to generate a flat line to represent it. The graph of the algorithm that creates that flat line from the raw data should be similar to what's measured for other headphones, and if not, it needs a good explanation.

A straight line means the target response is met; whether this is neutral or not depends on the target response, nothing more. This means the target response is essential for meaningful results. In acoustics we call this calibration.

If I were to do it I would calibrate the system by myself.
Simply using the factory supplied correction curves doesn't do the job because they do not take into account the translation from two speakers into a binaural configuration.

This is what Tyll tried to work around on his visit to Harman's reference room.
Unfortunately he did not manage to work around this:
Doing measurements in a more or less normal living-room-style environment has its problems. The human ear, connected to the brain, does to some degree separate the direct and the reflected sound. A measurement microphone on its own doesn't.
Fortunately there are measurement procedures using time windowing that can do the same.
I would experiment with different measurement methods and check the results for reasonableness on known good headphones.

I'm not sure waterfall plots are useful... Yes, they do show us how a peak in the frequency response decays over time but the same information is in a frequency response plot as low and high Q resonances. High Q resonances are visually indicated as narrow peaks and dips on a frequency graph and low Q resonances are indicated as broad peaks or dips on a frequency graph.

What's worse, waterfalls trade off time resolution against frequency resolution. Higher resolution in the time domain comes at the expense of lower resolution in the frequency domain. The errors are most often seen at low frequencies, where the time window is too short to characterize the low frequency amplitude response accurately.
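The trade-off can be put in numbers with the usual rule of thumb that a time window of length T resolves frequencies about 1/T apart. The exact constant depends on the window shape, so the sketch below is an approximation only, and the window lengths are hypothetical.

```python
def frequency_resolution_hz(window_s):
    """Approximate frequency resolution of one analysis window (1/T)."""
    if window_s <= 0:
        raise ValueError("window length must be positive")
    return 1.0 / window_s


def cycles_in_window(freq_hz, window_s):
    """Number of cycles of a given frequency that fit inside the window."""
    return freq_hz * window_s


for window_ms in (1, 5, 20):
    window_s = window_ms / 1000
    print(
        f"{window_ms:>2} ms window: ~{frequency_resolution_hz(window_s):.0f} Hz "
        f"resolution, {cycles_in_window(100, window_s):.1f} cycles of 100 Hz"
    )
```

A window that holds less than one cycle of a frequency cannot say much about the amplitude at that frequency, which is the low-frequency error described above.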

Yes, time and frequency domain are connected, but the Q information gets lost with plot smoothing.
I value what I can read out of a waterfall plot. That does not mean it's as easy to say what can be expected sound wise as it can be from a frequency response plot. But the same applies to the distortion plot.

As long as the frequency resolution is 20 points per octave (1/20th octave) or higher there is sufficient resolution to see high and medium Q resonances across the frequency bandwidth. We typically use 48 points per octave, log-spaced from 20 Hz to 20kHz, with the option of smoothing it to 1/24th octave or lower. This is sufficient resolution to accurately represent medium and high Q resonances without having to go into the time domain.
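To get a feel for these numbers, the sketch below builds a log-spaced frequency grid and estimates how many grid points fall inside the -3 dB bandwidth of a resonance of a given Q. It assumes a simple second-order resonance whose band edges are geometrically symmetric about the centre frequency; the Q values are hypothetical examples.

```python
import math


def log_spaced(f_start, f_stop, points_per_octave):
    """Log-spaced frequencies from f_start up to (at most) f_stop."""
    n_octaves = math.log2(f_stop / f_start)
    n_steps = math.floor(n_octaves * points_per_octave)
    return [f_start * 2 ** (k / points_per_octave) for k in range(n_steps + 1)]


def bandwidth_octaves(q):
    """-3 dB bandwidth, in octaves, of a resonance with quality factor q."""
    return (2 / math.log(2)) * math.asinh(1 / (2 * q))


def points_across_resonance(q, points_per_octave):
    """Roughly how many grid points land inside the -3 dB bandwidth."""
    return bandwidth_octaves(q) * points_per_octave


freqs = log_spaced(20.0, 20000.0, 48)
print(f"{len(freqs)} points from {freqs[0]:.0f} Hz to {freqs[-1]:.0f} Hz")

for q in (2, 10, 30):
    print(
        f"Q = {q:>2}: {bandwidth_octaves(q):.3f} octaves wide, "
        f"about {points_across_resonance(q, 48):.1f} points at 48 per octave"
    )
```

The same functions can be rerun at 20 points per octave to see how quickly a high Q resonance shrinks to one or two points on the graph.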

I applaud this move towards more perceptually meaningful headphone measurements which include a correction based on the Harman Target Curve. We've spent the past 5 years doing research on the preferred target response of in-ear and around/on ear headphones, and there is a clear relationship between measurements and perceived sound quality.

We recently completed a headphone study presented last week at the 144th AES Convention in Milan, where 130 listeners evaluated 32 models of AE-OE headphones that included the latest AE-OE Harman target curve. If you click on this link https://pbs.twimg.com/media/DeEtEeqXkAUq-2d.jpg you will see a graph showing the average frequency response of headphones (blue curve) that fell into four categories of sound quality based on listener ratings: Excellent, Good, Fair and Poor. We plot them against the Harman Target Curve (Green), and the Error Curve (red) which is the difference between the Harman Target Curve and the measured curve of the headphone. The smaller the error, the closer the red curve is to 0 dB. Hopefully, you can see that headphones tend to receive lower preference ratings as they deviate from the Harman target curve.

We also show a regression line (dotted curve) of the average error curve. As the slope of the line increases from flat (0 dB), headphones generally get lower ratings. I think you can also use the error curve as a way to interpret the spectral balance of the headphone. The link to this study can be found here: http://www.aes.org/e-lib/browse.cfm?elib=19436
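The error curve and its regression line can be sketched as below. Two assumptions are made that the comment does not spell out: the error is taken as measured minus target, and the line is fitted by least squares against log2 of frequency so that the slope reads in dB per octave. The data are invented.

```python
import math


def error_curve(measured_db, target_db):
    """Deviation of the headphone from the target, band by band (dB)."""
    return [m - t for m, t in zip(measured_db, target_db)]


def regression_db_per_octave(freqs_hz, error_db):
    """Least-squares line through the error curve on a log2 frequency axis.

    Returns (slope in dB per octave, intercept in dB).
    """
    x = [math.log2(f) for f in freqs_hz]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(error_db) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, error_db))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: a headphone that tilts downwards against the target.
freqs_hz = [125, 250, 500, 1000, 2000, 4000]
target_db = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
measured_db = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0]

error_db = error_curve(measured_db, target_db)
slope, intercept = regression_db_per_octave(freqs_hz, error_db)
print("error (dB):", error_db)
print(f"regression slope: {slope:+.2f} dB per octave")
```

A slope near zero with small residuals corresponds to a headphone that tracks the target; a large slope of either sign indicates a tilted spectral balance.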

As long as there will be measurements I will be fine and dandy. That's the big reason I come to InnerFidelity, as there are lots of subjective reviews around but very few measurements to back them up. That's not to say I just look at graphs and make purchases based on that, no no, it's the combination of subjective analysis and measurements.

Tyll was particularly good at describing the sound. Having heard so many headphones, he was able to say that headphone x has a reasonably wide soundstage, or that headphone x might have slight ringing in the upper midrange, or that the bass has pretty good punch to it, etc. I find that's the headphone reviewer's biggest task: to try to explain how it sounds (in comparison to other headphones).

I suggested several of these changes to Tyll long ago, but his software, or his ability to manipulate it, did not cooperate. It is also possible to compute a difference curve between free field or Harman and your results. I would show that.

As for THD.... I've long long long long ago been a proponent of measuring THD, not THD +N. Especially when N interferes. Since as you point out, there does not seem to be a correlation between measured THD and perception, this is an academic argument, but as you say, "at least we did it right." As for weighting, I'll be intrigued to see what you come up with.

Separating 'noise' from correlated and non-correlated HP produced distortion charts is a wonderful idea. Since the THD+N measurement is ideally the difference between input signal (noise and all) and the HP output, the noise that matters in this test is solely that produced in the measurement hardware and electronics. If complex musical signals are used for the test in place of a sine sweep, wouldn't a simple THD+N % figure give a practical, and useful number to compare, along with the THD+N vs frequency chart?

Do you think an IMD test would be helpful, with this too charted for at least two output SPLs?

I think your suggestion of rating headphones' characteristic sensitivity in dB, is a specification of much value, to go with the impedance vs freq. & phase curves. This would go hand-in-hand with the impedance curve to help determine amplifier needs.

Actually, THD+N is an antiquated measure that I believe tells us nothing about the THD when the noise is high. The noise should be separated from any THD measure. The only reason the THD+N measure persists is force of habit, and some older measurement devices which are not capable of separating the distortion from the noise.

I agree that IM is also important, but I'm more familiar with its use in amplifier measurement and cannot speak to its relevance to transducers. A more knowledgeable person would have to speak to that.

Thanks for reminding me about the headphone sensitivity expressed in dB, relative to a voltage. That's the only relevant number since it is the voltage that creates the SPL in a direct relationship, not the power.
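The voltage relationship can be written out as a short sketch. It assumes the sensitivity is quoted in dB SPL at 1 V RMS and that the headphone is operating linearly; the conversion from a dB-per-milliwatt figure further assumes a purely resistive nominal impedance, which real impedance curves only approximate. The example figures are hypothetical.

```python
import math


def spl_at_voltage(sens_db_per_volt, v_rms):
    """SPL produced by a given drive voltage."""
    return sens_db_per_volt + 20 * math.log10(v_rms)


def voltage_for_spl(sens_db_per_volt, target_spl):
    """Drive voltage (V RMS) needed to reach a target SPL."""
    return 10 ** ((target_spl - sens_db_per_volt) / 20)


def db_per_mw_to_db_per_volt(sens_db_per_mw, impedance_ohms):
    """Convert a dB SPL per 1 mW figure to dB SPL per 1 V RMS.

    At 1 V the power into a resistive load is 1000 / Z milliwatts.
    """
    return sens_db_per_mw + 10 * math.log10(1000 / impedance_ohms)


# Hypothetical headphone: 97 dB SPL per mW, 300 ohm nominal impedance.
sens_v = db_per_mw_to_db_per_volt(97.0, 300.0)
print(f"sensitivity: {sens_v:.1f} dB SPL at 1 V RMS")
print(f"voltage for 110 dB SPL peaks: {voltage_for_spl(sens_v, 110.0):.2f} V RMS")
```

Read alongside the impedance curve, the required voltage then gives the current an amplifier has to deliver.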

I would be interested to know if any IM is generated by headphones. It _is_ generated as a mixing product by the non-linearities of amplifiers, but perhaps headphones (or speakers) also have certain physical non-linearities in their construction.

Maybe it is interesting to split the harmonics in three groups based on how good or bad they usually sound. The ones sounding good can be split between octaves higher than the base tone and consonant harmonics (fifth, fourth, major third and minor third note distances).

The octaves are powers of two harmonics (2, 4, 8...) and make the sound richer (until a certain degree of course).

The consonant harmonics are harmonics until the 6th and all power of 2 multiples of these (so 3, 5, 6, 10, 12, 20...). These match with note distances in chords. However, it depends on the music whether these frequencies sound good together or not. And perhaps personal preference.

All other harmonics do generally not contribute positively to the sound.
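The three groups proposed above follow a simple rule: strip all factors of two from the harmonic number and look at what is left. A remainder of 1 means an octave, 3 or 5 means a consonant harmonic, and anything else falls into the last group. A small sketch of that rule, with group names chosen here for illustration:

```python
def odd_part(n):
    """Divide out every factor of two from n."""
    while n % 2 == 0:
        n //= 2
    return n


def classify_harmonic(n):
    """Group harmonic number n (2 = second harmonic) as proposed above."""
    if n < 2:
        raise ValueError("harmonic numbers start at 2")
    core = odd_part(n)
    if core == 1:
        return "octave"       # 2, 4, 8, 16, ...
    if core in (3, 5):
        return "consonant"    # 3, 5, 6, 10, 12, 20, ...
    return "other"            # 7, 9, 11, 13, 14, ...


for n in range(2, 13):
    print(n, classify_harmonic(n))
```

A grouped distortion figure could then be formed by summing the power of the harmonics in each group, although how the groups should be weighted perceptually is left open.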

When I heard that the measurement regime at InnerFidelity was changing, I greeted the news with much consternation. It sounds like you're giving this a lot of thought and will be making some exciting changes, for which I applaud you! I do think that the existing database of measurements is a valuable resource for comparing headphones, and I know that the new database of measurements will become such over time. Do you have any plans to make Tyll's compensation curve available along with the Harman Response, Diffuse Field, etc.? I know that the measurements wouldn't be directly comparable because of differences in the measurement rig, but it might provide some basic reference point. Also, do you plan to re-test any of the popular headphones that are in Tyll's database just for purposes of getting updated measurements, or will measurements only accompany new reviews?

As for the rest, I am not so sure. InnerFidelity will probably become more like Stereophile, focusing on overpriced gear (as exemplified by $ 3K being mentioned as a "starting point" on the last post), with little to no actual criticism. Advertisers will love it.

If you are going to reference something I said, and accuse me of bias, please have the courtesy to refer to me by name.

Oh, and include the context of the reference.
Here it is for those who would just read your post and have no context for it:

"I thought this would be a reasonable rig to use as a starting point in both price and performance for reacquainting myself with planar-magnetic over ear ‘phones, since my two-channel work/review schedule (and moving into a new place with my girlfriend, son and daughter) had meant it had been a while since I had high-end cans in the house, never mind on my head."

I guess the part where I specifically said "high-end cans" didn't translate.

If that's not your price range, that doesn't mean there's anything wrong with the gear.

I've been really busy lately working on this site, and its sister publication, so I've not had time to wade through the weeds of comments as much as I'd like.

1. You are right, I should have included the quote. Thank you for providing it.

2. I didn't mean to accuse you of anything. I merely provided my personal opinion regarding the future of the website. I said it would probably become more like Stereophile, would focus on what I consider to be overpriced gear, and that advertisers would be pleased with it. See? None of that is an accusation - either to you or InnerFidelity.

(3. I also didn't say anything about the gear.)

I would like to apologize if I offended you in any way. You have every right to run InnerFidelity as you see fit.

Just one thing: it is not always necessary to be so reactive. Being polite, and being friendly, usually works better in the long run.

Polite and friendly is what I've been since day one on this website. As well as inclusive, forthright, thoughtful, hopeful and happy.

Your apology is accepted, but let's be perfectly clear, you impugned my reputation by saying:

"InnerFidelity will probably become more like Stereophile, focusing on overpriced gear (as exemplified by $ 3K being mentioned as a "starting point" on the last post), with little to no actual criticism."

I don't appreciate you suggesting that I'm some shill who only writes something a manufacturer salivates over.

I've been a national-level newspaper journalist for 20 years prior to this. I don't take kindly to being told what to write, or for that matter what to do when it comes to my beat.

And make no mistake, this is my beat now, and I plan to treat it as such: with respect, an intelligent vantage point, sincerity, hard work and an eye to having fun.

Why, instead, don't you ask the offenders to remove 'indecent' words or their posts with 'indecent' words ? When you block ALL COMMENTERS because of actions of one or two individuals then your retaliation hits utterly wrong targets. Jesus Christ was against the Pharisaic method of collective punishment. If this is how the things will go on this website then soon there will be no comments sections at all. Some miserable sociopath will deliberately post indecent words in order to eliminate all discussion and you will oblige him.
I'll try to find a source/text on how to insult in a sophisticated and civilized manner, without resorting to gross words straight from a ghetto, so that people can learn how to insult in style. The link then should be posted at the beginning of each comments section.

We need a reliable definition of "Swearing". Swearing is used in a court of law, for God's sake!

If you wish to apply a filter to our common language, can it be Webster's English Dictionary or the OED? Noah Webster brought standardization to us English speakers, and it put the entire British Empire into a shared communication system that we are enjoying to this day. If you, Rafe Arnott, are to apply further limits to expressions, I will need it in writing from your Publisher and Owners.

Words mean things, limiting word usage is limiting the utility of language and the silencing of meanings.

People use words as tools, words in common usage have common meanings that people are trying to convey.

Be careful here, people in the USA have Constitutional RIGHTS : Free Speech is the most cherished and valued. Limiting Free Speech can land a person in Jail. ( where there is no FREE Speech )

I'm looking at implementing a new comments-section policy on the site which when used, will be easily legible to all who choose to spend their valuable time here.

So we are crystal clear, disabling comments on any post on this site is completely at the discretion of the editor. That said, I'd far and away prefer to never have to do that. I don't enjoy doing it. I want this site to be fun for all people to engage in.

Moving forward, disrespectful, hateful, or vindictive posts will be deleted with an explanation.

P.S. It would be helpful to have an edit ability for these comments, because it's rather easy for a mis-stated comment to get past the first and second rewrites and then be uneditable a few hours later, when a nice bit of rewriting would self-correct a problem comment. We have the edit capability on the Stereophile comments sections, don't we?

I just logged in so I could comment on the previous discussion, and I was surprised to see comments disabled.

When I think about it, the undisabled commentary may currently have too much poison to be considered healthy for Rafe. The fact that Tyll is gone does not justify focusing on the negative IMO.

However, going forward, I would have one wish for Keith: could you add a short summary of your listening experience to each review? This could later be extended if a majority of readers turn out to have an issue tuning into Rafe’s lifestyle.

Not sure if this was what the prior poster meant when he talked about Mr. Howard giving listening impressions. But for me, by far the most interesting aspect of the measurements was when Tyll attempted to correlate them with what he subjectively heard.

Sometimes the measurements lined up with what he heard, but often they did not... and Tyll wasn't afraid to discuss and wrestle with that fact. It seemed like he was working towards a better understanding of what the measurements did and did not mean. He never finished that journey but progress was certainly made.

In the end I think that is what I'm most curious about with the separation of your listening and Mr. Howard's measurements. This setup ends up being more like Stereophile where JA does the measurements and they sometimes completely deviate from the subjective listening of the individual reviewers. Sometimes it almost feels like a "gotcha" where the reviewer loves some aspect of a speaker and then JA measures that exact thing as being terribly flawed. There's rarely an attempt to explain how or why this happens, which makes it feel disconnected compared to Tyll's approach.

As others have mentioned, I believe that measuring a few reference headphones that many listeners are familiar with will help us understand what to look for in the new graphs. A far from exhaustive list of good choices to start with:

Sennheiser HD 600/650 - Obvious choices, don't even need to elaborate

Sennheiser HD 800/800S - Common high end headphone family a lot of folks have heard

Sony MDR-V6/7506 - Not a particularly good sounding headphone, IMO, but an example of one that draws a nearly perfect flat line on a DF compensated graph, making it a good candidate to contrast with later curves

Anything by Etymotic - Not a personal favorite, but a familiar reference among IEMs

Beyerdynamic DT880, AKG K7xx family - The other two members of the old guard former flagships, both still popular choices

These are just off the top of my head, mostly because I remember them from Tyll's article on his initial measurements with the (then current version of the) Harman curve. There are of course numerous other good contenders. Reach out to the community for samples—IF and Head-Fi regulars loved to send their gear to Tyll for measurement, and I'm sure plenty of folks would be more than willing to help.

Those headphones have already been measured and reviewed by many audio outfits.
On the other hand, there are headphones which either have not been properly measured or even if they were, they have not been reviewed by competent reviewers.
Life is short, let us spend our activity and efforts in a wise and economical way.

You've missed the point entirely. I don't care about reviews or the measurements for their own sake. I care about how the new measurement rig's graphs differ from those of the old rig, since they will not be directly comparable. The only way to do that is to measure some of the same models* on both and note how they differ.

Over time, the readership here have gotten used to Tyll's measurements. I could take a look at a new headphone's graph using his setup and instantly know if it was worth my time or not, despite the fact that a perfectly flat line is not the ideal result. We got used to the quirks and learned how certain graph features that kept showing up sounded because we compared them to our own listening experience. On Tyll's graphs, for instance, the ideal treble response includes a wide, deep notch centered around about 6 kHz. Headphones that have extra energy here on his graphs tend to sound overly bright and piercing. Similarly, headphones that remain flat to 3 kHz or beyond are too bright and glaring; the ideal shape is a downward slope beginning somewhere between 1 kHz and 2 kHz.

What will any of this look like on the new measurements? Without known references, it will be impossible to know. That's what I'm suggesting: that common reference headphones be measured with the new rig so that we have an idea what the graphs for models we all know very well look like. This would give us a frame of comparison for new headphones measured on the new rig: "Okay, this one looks pretty close to the [insert headphone I like]" or "Alas, this one has [insert specific flaw] just like the [insert headphone I don't like]".

*Ideally we would measure the exact same samples, but this would be an unreasonable expectation given how much time has passed and the fact that a lot of the measurements Tyll did were on headphones others sent in for him to measure. Assuming that unit variation is minimal and therefore shouldn't upset the comparison too much is, I feel, a perfectly reasonable compromise.

Apologies for the collective reply but it’s the easiest way of responding to your many comments and questions, which I’ll address in the order they were posted.

Yes, I understand that we do have Tyll’s archive, in the form of a 1GB zip file. Exactly what the zip contains and how easy or hard it is to get at the raw data therein I don’t yet know. If some or all of it is in unfamiliar binary files then accessing it may not be a cinch – but we’ll see. Once it’s clearer what we have I’ll let you know more. Certainly it’s feasible that we can compare Tyll’s measurements with ones I’ve made of the same products and obtain a ‘correction’ that will make future measurements more easily comparable with the legacy ones.

The 8/10kHz limit on frequency response accuracy that I mentioned is a function of the IEC 711 ear simulator which mimics the acoustical impedance of the ear. This limitation can indeed be removed for THD measurement by using a microphone mounted in a hole in a flat plate or, with the artificial pinna still in place, at the ear canal entrance to perform a ‘blocked meatus’ measurement. I can and will be experimenting with both of these possibilities as I also have to hand a G.R.A.S. 40BE capsule which I use for loudspeaker measurement. This is a quarter-inch mic with, indeed, a response out to 100kHz. But, as Kais noted, there’s a noise penalty, in this case of about 10dB (the 40BE’s self-noise is specified at 30dBA SPL), which increases the number of coherent averages required to achieve the target signal-to-noise ratio by a factor of about ten, since averaging N measurements lowers uncorrelated noise by only 10log10(N) dB.

Another possibility retains the complete artificial ear and uses what for now I’ll simply describe as a mathematical trick for extracting harmonic amplitudes out to 8/10kHz, without requiring frequency response beyond that. If it works then I’ll describe it in detail but it’s a longer-term prospect, not something that can be implemented immediately.

To clarify, the third-octave corrected responses I showed as an example in Figure 2 will each be flat if the headphone accords with the respective target response. Generally, of course, none of the corrected responses will actually be flat but may be close to it. We will publish all three, as well as the uncorrected responses.

Yes, there will be waterfall plots. The simplest to implement is cumulative spectral decay (CSD), which works well enough at the higher frequencies where diaphragm resonances generally intrude – and where different models of headphone often behave remarkably differently. There are, of course, some resonances inherent within the artificial ear but my experience is that these do not obscure the headphone’s behaviour. The time resolution/frequency resolution trade-off need not be fixed as in the CSD: a wavelet or wavelet-like approach can provide both good frequency resolution at LF and good time resolution at HF. I’m not convinced that we need take this route but if we do then there’s an elegant method of achieving it due to Gunness and Hoy.
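
For readers unfamiliar with how a CSD is built, the following is a minimal sketch of the basic idea only. It assumes an impulse response is already available, uses a plain rectangular truncation (practical implementations usually taper the window edges), and computes the DFT directly for clarity rather than speed. The function name, parameters and test signal are all illustrative, not taken from the software described here.

```python
import cmath
import math


def cumulative_spectral_decay(impulse, n_slices, hop):
    """Return one magnitude spectrum (dB) per slice.

    Slice k discards the first k*hop samples of the impulse response and
    analyses what remains, zero-padded back to the original length so
    that every slice has the same frequency-bin spacing.
    """
    n_fft = len(impulse)
    slices = []
    for k in range(n_slices):
        segment = impulse[k * hop:] + [0.0] * (k * hop)
        spectrum_db = []
        for b in range(n_fft // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * b * n / n_fft)
                      for n, x in enumerate(segment))
            # Floor the magnitude to avoid taking the log of zero
            spectrum_db.append(20 * math.log10(max(abs(acc), 1e-12)))
        slices.append(spectrum_db)
    return slices


# Hypothetical test signal: a single decaying resonance with a period of
# 8 samples, i.e. centred on bin 32 of a 256-point analysis.
impulse = [math.exp(-n / 20) * math.sin(2 * math.pi * n / 8) for n in range(256)]
slices = cumulative_spectral_decay(impulse, n_slices=4, hop=16)
```

Plotting successive slices behind one another gives the familiar waterfall. In this synthetic example the ridge at the resonance falls in level from slice to slice as the start time advances.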

This is a good juncture, perhaps, to make the point that we are not hog-tied in our signal processing by what any particular proprietary software solution will allow us. I write my own software to analyse headphone (and loudspeaker) measurements, which allows great flexibility both in signal processing and results presentation. As an example of what I mean in the latter context, Figure 2 (showing the third-octave corrected responses) can be difficult to interpret despite the different trace colours, especially where traces overlap. This is exactly the sort of issue which will be addressed by interactive features in our new results download file.

Re-measuring some reference headphones is a good idea and I hope I’ll get the opportunity to do it. While the focus is sure to be on new models, I would like to provide crossover both to Tyll’s measurement archive and to some stand-out past products.

No offense to Rafe Arnott intended in any way, but from where I sit, it sure seems like Keith Howard would have been a perfect fit in terms of replacing Tyll on his own. I'm not exactly sure how the two of them will make it work...

I have read and enjoyed articles by Rafe at another site. I have also read and enjoyed Keith's work at Stereophile. I can't imagine combining the two into a single headphone review and having it work out. I hope to be proven wrong though.

I think this is shaping up to be a great change. I'm excited to see where you guys go with it. I think having one man on measurements and one on reviewing the subjective side could be a brilliant combination. Glad that Mr. Katz approves of the new measurement paradigm over Tyll's as well. Also I think waterfall plots are the way to go and I'm happy to hear of their mention.

Great overview of your approach, sir. One of the great and vexing things about headphones is our ability to quantify "Good" vs "Bad". I think one reason is that each of us is wired differently and has variations in how we hear sound and what we hear as pleasing. That's why I always appreciate a review approach that balances data and experience. Sounds like you and Rafe are working together to do just that. Best of luck to you both!!

Also, worth noting: it's obvious from the tone that a lot of folks are hit hard by the loss of Tyll and the changes coming. Try to take that for what it is: challenge with change. There are some of us out here who are bummed by Tyll's departure but also optimistic that you and your team can do a great job with InnerFidelity. To paraphrase a movie quote: "If you (re)build it, they will come".

Thanks for your continued input. My comments to your most recent posts:

I’m confident that Rafe and I will dovetail well. It would have been impossible for me to edit InnerFidelity as I don’t live in North America – and a resident is essential. Rafe, being younger than me (curse him), will also bring more energy.

InnerFidelity’s archive is one of its great assets, to which we’ll be adding. It remains to be seen what we can do to bridge the differences between Tyll’s measurements and mine but a frequency response ‘correction’, for readers who want it, should be doable. I now have Tyll’s measurement archive, which I’m pleased to say comprises Excel database files. This makes his raw data easy to get at.

Slow implementation of the measurement regime changes isn’t feasible in this instance because of the different hardware I’ll be using. And I doubt that readers would be best pleased if I didn’t perform all the planned measurements from day one. Sometimes change just has to be step-change.

The 95% confidence intervals are calculated at each measurement frequency (i.e. for each FFT bin) using Student's t-distribution, because the dataset is limited to 10 distinct frequency response measurements per channel.
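
A minimal sketch of that calculation follows, assuming the repeated measurements for one channel are held as lists of dB values, one value per FFT bin. The function name and the small table of critical values are illustrative. The two-sided 95% critical value of Student's t for 9 degrees of freedom (10 measurements) is 2.262.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only a few entries are listed; 9 corresponds to 10 measurements.
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093}


def confidence_intervals_per_bin(measurements_db):
    """Return a (lower, upper) 95% interval for the mean at each FFT bin.

    measurements_db: list of repeated measurements, each a list of dB
    values with one entry per bin (an assumed data layout).
    """
    n = len(measurements_db)
    t_crit = T_CRIT_95[n - 1]
    intervals = []
    for bin_values in zip(*measurements_db):
        mean = statistics.fmean(bin_values)
        # Standard error of the mean, from the sample standard deviation
        sem = statistics.stdev(bin_values) / math.sqrt(n)
        half_width = t_crit * sem
        intervals.append((mean - half_width, mean + half_width))
    return intervals
```

With only 10 measurements the t-distribution gives a noticeably wider interval than the normal approximation would (2.262 versus 1.960 standard errors), which is the reason for using it with a dataset of this size.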

I suppose this is echoing Sean Olive, but I do have a dislike of third-octave smoothing. Even though you have to deal with the high-frequency "noise" of standing waves with microscale tolerance, a finer log frequency resolution tends to be more revealing of the midrange timbre and character. By reseating the headphone with sufficient trials, I would think that the mean should get close enough to be representative of the headphone's actual sound.

As neat as economical wavelets are, I have to wonder what you might expect to reveal. Single-driver headphones normally make a decent approximation of minimum phase over most of the bandwidth, give or take some flight time and all-pass shift from the ear.

"While it may seem odd to test headphones using left and right pinnae which are differently shaped, this actually reflects normal use; human beings are not symmetrical either. Some headphones cope with this better than others."

This doesn't make sense. Your brain already compensates for this lack of symmetry and has actually learned to hear based on your particular differences. Using differently shaped simulated ears when testing headphones only makes headphones that are not tolerant of the asymmetry appear to have higher manufacturing variance. All headphones should be tested with the exact same ear shape.