I found this thread among SoundExpert referrals and was a bit surprised by the almost complete misunderstanding of SE testing methodology, particularly of how the diff signal is used in SE audio quality metrics. The discussion of the topic from 2006 actually seems more meaningful. So I decided to post some SE basics here for reference purposes. I will use a thought experiment, though one that is close to reality.

Suppose we have two sound signals – a main one and a side one. They could be, for example, a short piano passage and some noise. We can prepare several mixes of them in different proportions:

equal levels of main and side signals (0 dB RMS)

half level of side signal (-6 dB RMS)

quarter level of side signal (-12 dB RMS)

1/8 level of side signal (-18 dB RMS)

1/16 level of side signal (-24 dB RMS)
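The mixing procedure above can be sketched in Python. This is only an illustration, not actual SE code: the toy sine signals standing in for the piano passage and the noise, and all function names, are my own inventions.

```python
import math

def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def db_to_gain(db):
    """Convert a dB offset to a linear amplitude gain."""
    return 10 ** (db / 20.0)

def mix(main, side, side_db):
    """Mix the side signal into the main one, attenuated by side_db dB."""
    g = db_to_gain(side_db)
    return [m + g * s for m, s in zip(main, side)]

# toy stand-ins for the piano passage and the noise
main = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
side = [math.sin(2 * math.pi * 1237 * n / 8000 + 0.7) for n in range(8000)]

# each -6 dB step roughly halves the side signal's amplitude
mixes = {db: mix(main, side, db) for db in (0, -6, -12, -18, -24)}
```

In practice each mix would then be level-normalized before listening, as the next step describes; that normalization is omitted here.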

After normalization all mixes have equal levels, and we can evaluate the perceptibility of the side signal in each mix. Here at SE we found that this perceptibility is a monotonic function of side signal level and looks like this:

(1) In other words, there is a relationship between the objectively measured level of the side signal and its subjectively estimated perceptibility in the mix. What is more:

(a) this relationship is well described by a second-order curve (assuming levels are in dB);

(b) the relationship holds for any sound signals, correlated or not – only the position and curvature of the curve differ.

(2) These side-stimulus perceptibility curves are the core of the SE rating mechanism. Each device under test has its own curve, plotted on the basis of SE online listening tests. (3) Side signals are the difference signals of the devices being tested. Levels of side signals are expressed in dB of the Difference level parameter, which in our case is exactly equal to the RMS level of the side signal. (4) Subjective grades of perceptibility are the anchor points of the 5-grade impairment scale. (5) Audio metrics beyond the threshold of audibility are determined by extrapolating those second-order curves. Virtual grades in the extrapolated area can be considered objective quality parameters that take the peculiarities of human hearing into account.
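The curve-fitting and extrapolation steps in (2) and (5) can be sketched as an ordinary least-squares quadratic fit of grade versus difference level in dB. The grade values below are made up for illustration – real SE curves come from collected listening-test grades:

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the 3x3 normal equations."""
    S = [sum(x ** k for x in xs) for k in range(5)]          # S[k] = sum of x^k
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # augmented matrix of the normal equations
    A = [[S[4], S[3], S[2], T[2]],
         [S[3], S[2], S[1], T[1]],
         [S[2], S[1], S[0], T[0]]]
    # Gaussian elimination with partial pivoting
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 4):
                A[r][c] -= f * A[i][c]
    coeffs = [0.0, 0.0, 0.0]
    for i in range(2, -1, -1):
        coeffs[i] = (A[i][3] - sum(A[i][c] * coeffs[c]
                                   for c in range(i + 1, 3))) / A[i][i]
    return coeffs  # (a, b, c)

def grade(coeffs, diff_level_db):
    """Evaluate the fitted curve at a given difference level (dB)."""
    a, b, c = coeffs
    return a * diff_level_db ** 2 + b * diff_level_db + c

levels = [0.0, -6.0, -12.0, -18.0, -24.0]        # difference levels, dB
grades = [2.0, 2.888, 3.632, 4.232, 4.688]       # made-up grades, 5-grade scale
a, b, c = fit_quadratic(levels, grades)
# extrapolating beyond audibility yields a "virtual grade" above 5
virtual = grade((a, b, c), -36.0)
```

The design choice mirrors point (a) above: with levels in dB, a second-order polynomial is assumed sufficient, so three coefficients fully determine a device's curve.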

So, yes, the difference signal is used in SE testing. We take into account both its level and how the human auditory system perceives it together with the reference signal. Some difference signals with fairly high levels still remain almost imperceptible against the background of the reference signal, and vice versa; the perceptibility curves reflect this.

This is the concept. Many parts of it still need thorough verification in carefully designed listening tests, which are beyond SE's means. All we can do is analyze the grades returned by SE visitors. This will certainly be done, and yet it cannot replace properly organized listening tests.

SE testing methodology is new and open to question, but all its assumptions look reasonable and SE ratings look promising, at least to me. Time will tell.

Shouldn't it be a flat line once you reach "imperceptible"? If not, once something is imperceptible, how can it become "more imperceptible"?

Matter of definition, interpretation and use.

1) Consider three chess games, all "theoretically lost". One is a simple mate in one. Another is so hard that if you put 1000 chess players to the task, you would not be able to distinguish it from the starting position by statistical analysis of the outcomes. And the third is so hard that it will not be solved in fifty years. To make the logic clear-cut, assume that the second is like the third, except with 70 intermediate "only moves" (which do not constitute any learning curve for the subsequent ones).

Now, everything else being equal, you will still have a clear strict preference, because you risk meeting one of the very few chess players who can actually win this. You might not know that it is humanly winnable, but you will absolutely want to insure against the uncertainty if the insurance is free.

Now consider a step-by-step sequence of chess positions, starting from the "third" one above. We index them by the number of very hard moves until the win is clear, as measured by statistics at some confidence level [say, p]. How do you define the human-winnability threshold?

2) Consider a 32-bit sound file, then a 31-bit (LSB-truncated) file, and so on. Rank these. You may claim that every file above a "hearing threshold" of slightly below T bits is equivalent. However, what if it is an unfinished product? Are you sure the final mix will have the same hearing threshold? If not, then the high-resolution file could very well be more robust – there might be manipulations that would let you hear a difference between the final mix and its T-bit version, although not between the original and its T-bit version. Most 16-bit CDs are mixed at a higher word length, right?

Solution? A "robustness-to-manipulations" measure?
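The 32-bit → 31-bit → … sequence above can be sketched for integer PCM samples. `truncate_bits` is a hypothetical helper of my own; it zeroes the low bits with arithmetic shifts (plain truncation toward negative infinity, not dithered requantization as a real mastering chain would use):

```python
def truncate_bits(samples, from_bits, to_bits):
    """Zero out the (from_bits - to_bits) least significant bits of integer PCM samples."""
    shift = from_bits - to_bits
    return [(s >> shift) << shift for s in samples]

# a 16-bit sequence reduced to 8-bit resolution (values keep the 16-bit scale)
pcm16 = [12345, -12345, 255, -256]
pcm8 = truncate_bits(pcm16, 16, 8)
```

Ranking the files by word length then amounts to comparing the RMS of the truncation error against the original – the same difference-signal idea as in the SE post above.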

Of course:

- if no such issues apply, then assigning zero value to superfluous information is at least as good a measure as anything else;

- if anyone makes a selling claim, then they have the burden of proof, and "inaudible difference" is the null hypothesis. You would grab the extra measured quality if it were free, as insurance against audibility, but you would frown upon someone trying to sell you insurance against a disaster which no-one has ever substantiated has happened or could ever happen (... well ...: http://en.wikipedia.org/wiki/Alien_abduction_insurance );

- even if we assume there is some worth to this not-justified-as-generally-audible quality, it is hard to quantify. Justifying (by measurement) that it exists does not mean we can justify a reasonably narrow confidence interval for a particular point on the graph.