...
Doing so now, we see that none other than Dr. Lipshitz had run his own single-blind tests. So of course he would have no problem with John's test being single-blind, as long as the protocol was such that reliable data could still be extracted.

....

Yes, but I also noticed in the provided links that he used at least 10 trials.

I'd prefer the actual article where it was first written not a reference one year later.

I searched, and that was the only proper hit I found. All the rest were folks handing the quote from one person to another, thinking they were saying something damning about John, when in reality it reflected their lack of knowledge of how easy it is to conduct a poorly done double-blind test, and how improper it is to grab a headline and run with it.

But sure, give me the link to the original and I will read and comment. Until then, I hope in this educated thread and forum we know better than to pass around that headline.

I'd prefer the actual article where it was first written not a reference one year later.

Good grief, can you not accept anything at face value? You remind me of the skeptic who, when asked what was the color of the cow on the hill behind him, replied "Brown -- on this side." :-)

In the article Amir linked to - http://www.stereophile.com/asweseeit/407awsi/index.html - I mentioned that I had first used the "scoundrel" phrase in the "Letters" column in the December 1996 issue of Stereophile. That "Letters" column is not reprinted on the magazine's website, so to put your mind at rest, here is the text of the letter that triggered my usage, along with my response (both copyright TEN: The Enthusiast Network):

DVD & sound quality

Editor: The October '96 issue of Mix magazine featured an article on DVD by Philip DeLancie: "DVD---Almost Ready for Prime Time?" Commenting on the possibility of DVD-based audio discs, DeLancie wrote: "As for the fidelity benefits of higher sampling rates and greater bit depths, I suspect that in true double-blind comparative listening tests, few listeners (in the music industry) would be able to tell much difference between a standard CD and a 24-bit/96kHz DVD for most types of music. Indeed, there is so much room for improvement in other links of the audio chain, all the way from microphone placement through to home speaker design, that the fidelity of the CD can hardly be singled out as a limiting factor for contemporary musical enjoyment."

Would anyone on the Stereophile staff care to rebut this statement? - no name given
Photogcw@vvm.com

While I regard "double-blind comparative listening tests" as the last refuge of the agenda-driven scoundrel, how many listeners would be able to tell the difference is indeed the question! From my own experience of working with digital music data having a nominal 20-bit resolution, I would say that Philip DeLancie is just plain wrong---but whether a dedicated High Quality Audio Disc is a commercial proposition still depends on how many people like me there are in the world who would pay for that quality (see the next letter). However, the question may be moot. As I wrote in November's "As We See It" (p.3, http://www.stereophile.com/asweseeit/1196awsi/index.html ), once DVD-ROM is widely available, it is perfectly feasible to use it to distribute 96kHz-sampled, 24-bit multichannel recordings without there being an agreed-upon audio hardware standard. -- John Atkinson

I also noticed in the provided links that [Stanley Lipshitz] used at least 10 trials.

There is always the issue of listener fatigue to deal with in blind tests. A late-1980s AES paper by Willy Hansen, then with Bang & Olufsen, based on work performed as part of the European Eureka Project, showed that the time allowed for a formal blind test shouldn't be more than (IIRC) 45 minutes, if listener fatigue is not to become an interfering variable. In my San Mateo tests, we gave 9 presentations in each session, with the first 2 sighted for training. Each session lasted about 40 minutes.
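The trial count matters statistically as well as ergonomically. As a rough sketch (my own illustration, not part of either Lipshitz's or Atkinson's actual protocol), a one-sided binomial test shows why something on the order of 10 scored trials is a practical minimum:

```python
from math import comb

def p_value(n_trials, n_correct, p_guess=0.5):
    """One-sided binomial p-value: the chance of scoring n_correct
    or better in n_trials by pure guessing."""
    return sum(comb(n_trials, k) * p_guess**k * (1 - p_guess)**(n_trials - k)
               for k in range(n_correct, n_trials + 1))

# With 7 scored trials (9 presentations minus 2 sighted training runs),
# only a perfect score clears p < 0.01:
print(round(p_value(7, 7), 4))   # 0.0078
# With 10 scored trials, 9 correct is still significant at the 5% level:
print(round(p_value(10, 9), 4))  # 0.0107
```

The point is that with very few trials, even a listener who genuinely hears a difference has little room for a single lapse before the result becomes statistically meaningless.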

You know, there's a bit of irony in name-dropping Lipshitz when one considers the following quote: "I regard 'double-blind comparative listening tests' as the last refuge of the agenda-driven scoundrel."
JOHN ATKINSON, Stereophile (December 1996, page 23)

I don't know why it is "name dropping" to mention Stanley. He and I have known each other since we first met at the London AES convention in 1980, and while we disagree on some things, he was kind enough to write the INS a testimonial on my behalf when I moved to the United States to take the job as editor of Stereophile.

Quote:

Originally Posted by Chu Gai

Maybe there's more to it and if so I'd like to read the entirety of it but as it stands, well...

Just sayin...

Yup, that's what I wrote, as a reaction to having taken part as a listener in too many "scientific tests" organized by third parties where the protocol had been rigged to give the outcome the proctor wanted.

For example, there was a blind test of surround-sound processors at an early 1990s AES Convention where the scores of the best and worst-performing processors were lumped together on the grounds that they were both Dolby ProLogic designs and thus should have performed identically. When that was done, it was "proved" that the listeners could not detect the benefit of ProLogic.

Or at another AES convention, in a blind test of speaker cables, the proctor ignored complaints from the listeners that the sound of the test speakers was being picked up by his podium mike and thus was able to "prove" that listeners could not distinguish between heavy-gauge cable and 24-gauge.

Or at yet another AES convention, where TDK sponsored a series of blind tests that "proved" that the listeners were unable to distinguish a CD from a cassette copy of it.

The examples, sadly, are legion.

John Atkinson
Editor, Stereophile

Answered somewhat in reverse...

You're right that there are examples of poorly done and contrived tests, sometimes with cherry-picked or falsified data, sometimes with conclusions that far outstrip the findings, in audio as well as other fields. One need only look at Dr. Oz's weekly pronouncements for the next weight-loss product, or the discredited work on vaccination and autism. But in the interests of perspective, John, I don't see you presenting an equally compelling and proportionally weighted case for blind tests that were done fairly well.

As to name dropping, well, you did. There was no independent statement, ideally from Lipshitz himself, specifically stating what it was that he liked. And afaik, whatever he specifically said does not carry with it a subsequent endorsement of the data analysis. By mentioning his name in passing, you can imply that the exercises you undertook have his stamp of approval from cradle to grave. It may not have been what you intended, but...

Being as you live in Brooklyn, you can't be all bad though. That is unless you don't make periodic pilgrimages to Spumoni Gardens.

"I've found that when you want to know the truth about someone that someone is probably the last person you should ask." - Gregory House

Just wanted to see the original context where the phrase was used. And really, in this world accepting things at face value is a good way to get hoodwinked. 15-day waiting period for the VA? Troops are leaving the eastern border of Ukraine. I've never failed a doping test. Dinosaurs and man walked the earth together. Chimps don't exhibit premeditated violent behavior. All the animals rushed to higher ground during the tsunami. I did not have sex with that woman. Skeptical?

"I've found that when you want to know the truth about someone that someone is probably the last person you should ask." - Gregory House

So, if I understand your protocol, you are comparing a component, in this case an amp, against a direct wire bypassing that component?
Is the signal source capable of driving a speaker? Do you test CD players in this manner?

If you look at the test setup, the amp under test is loaded with an artificial load simulating a speaker load. From the load, the signal is attenuated back to the same level as the input and fed into the amplifier driving the monitor speakers. With this setup they know that any sonic difference between the two reproduction chains comes from the DUT not being a fully transparent component.

You can't test sources using this method. What the Swedish AES has done to verify that the source they use does not colour the sound in any detectable way is to use a high-quality analog master tape as the reference, and then compare that to the same tape processed through their A/D and back through their D/A.
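The attenuation step described above is just level-matching arithmetic. As a hedged sketch (my own illustration; the gain figure is invented, not taken from the Swedish AES setup):

```python
import math

def attenuation_db(voltage_gain):
    """Attenuation needed after the artificial load so the DUT branch
    returns to the same level as the bypass (straight-wire) branch."""
    return 20 * math.log10(voltage_gain)

def level_error_db(v_a, v_b):
    """Residual level mismatch between the two branches, in dB.
    Blind-test practice usually keeps this within about 0.1 dB."""
    return 20 * math.log10(v_a / v_b)

# A hypothetical power amp with a voltage gain of 28.2x (29 dB)
# needs 29 dB of attenuation ahead of the monitor amplifier:
print(round(attenuation_db(28.2), 1))       # 29.0
# A 1% voltage mismatch is already close to the 0.1 dB limit:
print(round(level_error_db(1.01, 1.0), 3))  # 0.086
```

This is why the attenuator has to be trimmed precisely: even small gain errors between the branches would be heard as a level difference rather than as the DUT's coloration.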

If you look at the test setup, the amp under test is loaded with an artificial load simulating a speaker load. From the load, the signal is attenuated back to the same level as the input and fed into the amplifier driving the monitor speakers. With this setup they know that any sonic difference between the two reproduction chains comes from the DUT not being a fully transparent component.

You can't test sources using this method. What the Swedish AES has done to verify that the source they use does not colour the sound in any detectable way is to use a high-quality analog master tape as the reference, and then compare that to the same tape processed through their A/D and back through their D/A.

Thanks. After going over that diagram a few times and what Arny mentioned, I understand how it was hooked up and why you don't test source components.
But, I still don't see how and why this is a credible protocol. All you are testing, maybe, is whether an amp, in this case, is transparent compared to a straight wire, not whether two amps are audibly different. And even if it is not transparent as a wire, that doesn't mean it will be audibly different from another amp, or from the amp that is actually driving the speakers in that setup, when compared to each other. As it is, you are not comparing the two amps.

Mr. Causey offers no detail about his comparisons—about whether they were performed blind or not, or with levels matched or not. But I will note that it is trivially easy to organize a formal test that produces a null result even when real differences exist (footnote 2).

Footnote 2: Hence my writing back in the December 1996 issue's "Letters" (p.23) that I "regard 'double-blind comparative listening tests' as the last refuge of the agenda-driven scoundrel," a statement that got Mr. Aczel sorely vexed.

If so he could not have said it better. As I noted in my last post, people use anything that says "double blind" as some kind of "scientific data" to beat up the other guy, when in reality they have not understood what the test is about and, in this case, who had even written it! As John nicely puts it, it is the last refuge of the agenda-driven scoundrel, present company excluded of course.

As John so correctly puts it, it is abundantly easy to put together tests that show no difference. What is hard is one that does differentiate. That is why I commend Arny so much on his amplifier test that showed differences in his double blind testing. That one report is worth more than 50 others that didn't find a difference.

Let me remind everyone what our beloved mentor has said but sadly ignored in so many of these tests with negative outcomes:

Quote:

Originally Posted by arnyk

Good luck. The obvious questions would relate to what you intend to test and how.

Here are some guidelines to follow:

Ten (10) Requirements For Sensitive and Reliable Listening Tests

(1) Program material must include critical passages that enable audible differences to be most easily heard.

(2) Listeners must be sensitized to audible differences, so that if an audible difference is generated by the equipment, the listener will notice it and have a useful reaction to it.

(3) Listeners must be trained to listen systematically so that audible problems are heard.

(4) Procedures should be "open" to detecting problems that aren't necessarily technically well-understood or even expected, at this time. A classic problem with measurements and some listening tests is that each one focuses on one or only a few problems, allowing others to escape notice.

(5) We must have confidence that the Unit Under Test (UUT) is representative of the kind of equipment it represents. In other words the UUT must not be broken, it must not be appreciably modified in some secret way, and must not be the wrong make or model, among other things.

(6) A suitable listening environment must be provided. It can't be too dull, too bright, too noisy, too reverberant, or too harsh. The speakers and other components have to be sufficiently free from distortion, the room must be noise-free, etc.

(7) Listeners need to be in a good mood for listening, in good physical condition (no blocked-up ears!), and well trained at hearing deficiencies in reproduced sound.

(8) Sample volume levels need to be matched to each other or else the listeners will perceive differences that are simply due to volume differences.

(9) Non-audible influences need to be controlled so that the listener reaches his conclusions due to "Just listening".

(10) Listeners should control as many of the aspects of the listening test as possible. Self-controlled tests usually facilitate this. Most importantly, they should be able to switch among the alternatives at times of their choosing. The switchover should be as instantaneous and non-disruptive as possible.

See the top commandments I have highlighted? That is what John is saying. I guess great men think alike!
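Requirement (10), the self-controlled, instantaneous switchover, is the heart of an ABX comparator. As a toy sketch of just the hidden-assignment-and-scoring logic (my own illustration, not Arny's PCABX software):

```python
import random

def run_abx(n_trials, listener, seed=0):
    """Minimal ABX scoring loop: on each trial, X is secretly A or B,
    the listener guesses which, and correct identifications are counted."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        x_is_a = rng.random() < 0.5        # hidden assignment of X
        correct += listener(x_is_a) == x_is_a
    return correct

# A listener who always hears the difference scores 16/16; one who
# hears nothing is effectively a coin flip and hovers around 50%:
print(run_abx(16, lambda x_is_a: x_is_a))  # 16
```

The score is then evaluated against the binomial distribution to decide whether it could plausibly be guessing.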

Quote:

Just sayin...

Me too. The problem is that folks are not listening...

There's a funny story about the "10 requirements". They originated in a discussion between John and me on Usenet. Maybe the first 5 or so came out of that discussion - they are as much paraphrases of his contributions to the conversation as anything.

They were on the home page of my old www.pcabx.com web site, and I still take them very seriously. BTW since you seem to be here and are reading this thread, Thanks, John.

But, I still don't see how and why this is a credible protocol. All you are testing, maybe, is whether an amp, in this case, is transparent compared to a straight wire, not whether two amps are audibly different.

It is a very credible protocol. Why would you want an amp not to act as a "wire with gain"? Why would anyone interested in the highest level of reproduction accuracy settle for a component that distorts the signal audibly, if there are alternatives that do not distort the signal?!? The Swedish AES is interested in finding transparent components that are good value, and the testing scheme they have come to adopt tests just that.

Quote:

Originally Posted by CharlesJ

And, even if it is not transparent as a wire it doesn't mean that it will be audibly different from another amp or the amp that is actually driving the speakers in that setup when compared to each other. As it is, you are not comparing the two amps.

An A/B or an ABX test as it is traditionally set up may end up comparing two possibly flawed DUTs against each other. If they are "equally flawed" you will not be able to distinguish between them. With this test setup, the BAX test if you will, the Swedish AES has come up with a test that basically compares the output of the DUT to an absolute reference. They happen to use a reproduction chain that has been selected using their methodology and thus is exceptionally free of coloration, but they are very clear in their communication that this setup works and gives accurate information about the DUT's colorations even if the rest of the reference reproduction chain has pretty significant coloration. The only thing that differs between B and A is that the DUT is in the signal path, so if you get an audible difference it is due to the DUT messing up the signal.

IMHO it is an extraordinarily clever test setup that they have come up with to find components that are truly transparent.
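One way to see why the bypass comparison is so sensitive: everything the DUT does to the signal can be expressed as a residual below the reference. A hedged numerical sketch (my own illustration, not the Swedish AES procedure, which is a listening test rather than a calculation):

```python
import math

def residual_db(reference, dut_branch):
    """RMS level of (DUT branch minus bypass branch), in dB relative
    to the reference; a crude objective analogue of the B-vs-A
    comparison the listeners perform by ear."""
    rms = lambda xs: math.sqrt(sum(x * x for x in xs) / len(xs))
    diff = [d - r for d, r in zip(dut_branch, reference)]
    return 20 * math.log10(rms(diff) / rms(reference))

# A DUT that is perfect except for a 1% flat gain error leaves a
# residual exactly 40 dB below the signal:
tone = [math.sin(2 * math.pi * 997 * n / 48000) for n in range(4800)]
print(round(residual_db(tone, [1.01 * x for x in tone]), 1))  # -40.0
```

Whether a residual at a given level is audible is, of course, exactly the question the blind listening test is there to answer.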

It is a very credible protocol. Why would you want an amp not to act as a "wire with gain"? Why would anyone interested in the highest level of reproduction accuracy settle for a component that distorts the signal audibly, if there are alternatives that do not distort the signal?!? The Swedish AES is interested in finding transparent components that are good value, and the testing scheme they have come to adopt tests just that.
An A/B or an ABX test as it is traditionally set up may end up comparing two possibly flawed DUTs against each other. If they are "equally flawed" you will not be able to distinguish between them. With this test setup, the BAX test if you will, the Swedish AES has come up with a test that basically compares the output of the DUT to an absolute reference. They happen to use a reproduction chain that has been selected using their methodology and thus is exceptionally free of coloration, but they are very clear in their communication that this setup works and gives accurate information about the DUT's colorations even if the rest of the reference reproduction chain has pretty significant coloration. The only thing that differs between B and A is that the DUT is in the signal path, so if you get an audible difference it is due to the DUT messing up the signal.

IMHO it is an extraordinarily clever test setup that they have come up with to find components that are truly transparent.

All amplifiers are flawed! There isn't one exception! However, they are all flawed differently!

It is a very credible protocol. Why would you want an amp not to act as a "wire with gain"? Why would anyone interested in the highest level of reproduction accuracy settle for a component that distorts the signal audibly, if there are alternatives that do not distort the signal?!? The Swedish AES is interested in finding transparent components that are good value, and the testing scheme they have come to adopt tests just that.
An A/B or an ABX test as it is traditionally set up may end up comparing two possibly flawed DUTs against each other. If they are "equally flawed" you will not be able to distinguish between them. With this test setup, the BAX test if you will, the Swedish AES has come up with a test that basically compares the output of the DUT to an absolute reference. They happen to use a reproduction chain that has been selected using their methodology and thus is exceptionally free of coloration, but they are very clear in their communication that this setup works and gives accurate information about the DUT's colorations even if the rest of the reference reproduction chain has pretty significant coloration. The only thing that differs between B and A is that the DUT is in the signal path, so if you get an audible difference it is due to the DUT messing up the signal.

IMHO it is an extraordinarily clever test setup that they have come up with to find components that are truly transparent.

Thanks for the added explanation. I need to think about this but may not have enough expertise to see if there is a flaw in this methodology, or not.

Your fallacy is that you're claiming that I said that I can hear audible flaws in all amplifiers, no exceptions. This is an utter fabrication. Please show me where I've said such a thing, ever!

Funny you should mention DBT's; I'm currently organizing one. You might want to check out the thread;) !

To my knowledge, I've been discussing nothing but testing for *audible* coloration in the post you quoted, so the "lack in preservation of ones context, is a pretext for fallacy"... maybe want to rethink that argument? No?

To my knowledge, I've been discussing nothing but testing for *audible* coloration in the post you quoted, so the "lack in preservation of ones context, is a pretext for fallacy"... maybe want to rethink that argument? No?

Not at all. I made a comment on a truism about amplifiers, you assumed something about my statement and made a comment based on your assumption, instead of asking me to clarify. You're making the same mistake again.

My comment was intended to support your position.

I am in your camp, we just happen to be on opposite sides of the camp fire at this moment.

Not at all. I made a comment on a truism about amplifiers, you assumed something about my statement and made a comment based on your assumption, instead of asking me to clarify. You're making the same mistake again

Yes, I understand, you just mistakenly fell upon random keys of your computer, accidentally quoting my post, completely unrelated to your truism, upon publishing said amplifier truism. I kind of remember another person with the same unfortunate chain of events that you have allegedly suffered from, being misrepresented:


As far as I am concerned, that is the end of the matter.

I'm glad to have been of some psychological/emotional benefit to you... The session has ended!

I'm glad to have been of some psychological/emotional benefit to you...

I am sorry to say, but you have not. *Anyone* re-reading your ramblings from, say, http://www.avsforum.com/t/1532092/debate-thread-scotts-hi-res-audio-test/330#post_24779786 and forward will understand who is having psychological/emotional issues with the topic at hand and who is not. I hope in the future you can have a mature discussion about what has been said and what has not and in what context, but as of now I just do not see that happening.

It is a very credible protocol. Why would you want an amp not to act as a "wire with gain"? Why would anyone interested in the highest level of reproduction accuracy settle for a component that distorts the signal audibly, if there are alternatives that do not distort the signal?!? The Swedish AES is interested in finding transparent components that are good value, and the testing scheme they have come to adopt tests just that.
An A/B or an ABX test as it is traditionally set up may end up comparing two possibly flawed DUTs against each other. If they are "equally flawed" you will not be able to distinguish between them. With this test setup, the BAX test if you will, the Swedish AES has come up with a test that basically compares the output of the DUT to an absolute reference. They happen to use a reproduction chain that has been selected using their methodology and thus is exceptionally free of coloration, but they are very clear in their communication that this setup works and gives accurate information about the DUT's colorations even if the rest of the reference reproduction chain has pretty significant coloration. The only thing that differs between B and A is that the DUT is in the signal path, so if you get an audible difference it is due to the DUT messing up the signal.

IMHO it is an extraordinarily clever test setup that they have come up with to find components that are truly transparent.

If I remember correctly, you have found some amps to be transparent with this methodology.
Has the society by any chance taken such an amp and compared it, using the traditional ABX method, to other competent but non-transparent amps, and been able to differentiate between them? I would think that if this could be consistently accomplished between the so-called transparent amps and all the others using the traditional method, it would validate the protocol, no?

I am sorry to say, but you have not. *Anyone* re-reading your ramblings from, say, http://www.avsforum.com/t/1532092/debate-thread-scotts-hi-res-audio-test/330#post_24779786 and forward will understand who is having psychological/emotional issues with the topic at hand and who is not. I hope in the future you can have a mature discussion about what has been said and what has not and in what context, but as of now I just do not see that happening.

Hmmm

You make up a false quote; then emotionally defend it, repeatedly, even after I gently attempt to turn you away from your assumptive and truth-perverting ways.

As I have said, my post was made in support of your statements; you misunderstood and have remained untoward ever since.

Obviously, you were attempting to mislead all of us again when you stated: "As far as I am concerned, that is the end of the matter. "

Guys let's be nice to each other and remain focused on technical topics please. Thanks.

And I am sure I speak for everyone when I say that you've been a model poster, going the extra mile when it comes to consistently being nice to everyone. I never see you goading people, looking to humiliate posters, or veering from the topic at hand. Judiciously peppering your words with smiley-face emoticons convincingly demonstrates your pure heart and goodwill. After all, why would anyone use a ?

But, I still don't see how and why this is a credible protocol. All you are testing, maybe, if an amp, in this case, is transparent to a straight wire not that two amps are audibly different.

It is a very credible protocol. Why would you want an amp not to act as a "wire with gain"? Why would anyone interested in the highest level of reproduction accuracy settle for a component that distorts the signal audibly, if there are alternatives that do not distort the signal?!? The Swedish AES is interested in finding transparent components that are good value, and the testing scheme they have come to adopt tests just that.

Well stated and totally agreed.

Quote:

Quote:

Originally Posted by CharlesJ

And, even if it is not transparent as a wire it doesn't mean that it will be audibly different from another amp...

Agreed.

Quote:

or the amp that is actually driving the speakers in that setup

I think this is the crux of people's reservations about straight wire bypass tests - there is always this additional amplifier in the signal chain which is unusual. I will treat this issue below.

Quote:

when compared to each other. As it is, you are not comparing the two amps.

It turns out that this is good news on several important grounds, which I will also deal with below.

Quote:

An A/B or an ABX test as it is traditionally set up may end up comparing two possibly flawed DUTs against each other. If they are "equally flawed" you will not be able to distinguish between them.

While that is true in theory, the probability of two amps being both audibly flawed and sonically indistinguishable is IME low.

Quote:

With this test setup, the BAX test if you will, the Swedish AES has come up with a test that basically compares the output of the DUT to an absolute reference. They happen to use a reproduction chain that has been selected using their methodology and thus is exceptionally free of coloration, but they are very clear in their communication that this setup works and gives accurate information about the DUT's colorations even if the rest of the reference reproduction chain has pretty significant coloration. The only thing that differs between B and A is that the DUT is in the signal path, so if you get an audible difference it is due to the DUT messing up the signal.

Agreed on all points.

Quote:

Quote:

IMHO it is an extraordinarily clever test setup that they have come up with to find components that are truly transparent.

I do object to calling it the Swedish AES test because as I will show conclusively below, they did not originate it.

This paper, "Audible Amplifier Distortion Is Not a Mystery" by Peter Baxandall, published in Wireless World in November 1977, describes what I believe to be essentially the same test:

Note: This article is old, old, old and I can reasonably predict that someone who should know better will start quoting paragraphs from it like they were revealed truth when they are merely out of date and should be ignored.

I believe that all of the elements of the Swedish test setup, and then some, are shown above. The above article has one serious omission: the lack of a DBT test coordinator such as an "ABX Comparator". This may be excused on the grounds that the ABX Comparator had not yet been disseminated to the public at the time.

About the second amplifier (or as Baxandall puts it, "The Monitoring system") in a straight wire bypass test

Most audiophiles have never hooked the speaker terminals of one amplifier to the input terminals of another, so doing so triggers fear of the unknown and attendant anxiety, if not revulsion.

In fact this can be done, and if done correctly there will often be outstandingly good results. The essence of correctly connecting the output of a power amplifier to the input of another is reducing the voltage level from the first power amplifier so that the second is not overdriven. This is commonly done in automotive audio systems, to the extent that hardware for the purpose is readily available:

In a DBT this amplifier need only be good enough to avoid biasing the experiment's outcome. Some have called for this amplifier to have far less distortion than any amplifier under test, which is excessive and may even be practically impossible. In fact the amplifier that is part of the monitoring system need only be good enough to avoid masking any audible artifacts created by the UUT. IOW this amplifier's performance requirements are more likely to be set by the relatively easy-to-meet limitations of the human ear, not by the limitations of the amplifier SOTA. For example, its noise floor could be as poor as that of the musical program material used to run the test, which is an easy bar for a good amplifier to hurdle. Its frequency response variations need only be small enough that the frequency response variations of the UUT are not masked. This allows the use of monitor amplifiers with larger in-band frequency response variations than the UUT, just not excessively so. Obviously, frequency response variations, distortion, and noise outside the audible range are irrelevant so long as they do not affect sounds within the audible range.
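The noise-floor point is easy to quantify: uncorrelated noise floors add as powers, so a monitor amp whose floor merely matches the program's raises the total by only 3 dB. A sketch (the dB figures are illustrative, not measurements of any particular system):

```python
import math

def combined_noise_db(*floors_db):
    """Combine uncorrelated noise floors (each in dB re the same
    reference level) by summing their powers."""
    return 10 * math.log10(sum(10 ** (f / 10) for f in floors_db))

# Program noise at -70 dB plus a monitor amp also at -70 dB raises
# the total floor by only 3 dB; an amp 10 dB quieter adds < 0.5 dB:
print(round(combined_noise_db(-70, -70), 1))  # -67.0
print(round(combined_noise_db(-70, -80), 2))  # -69.59
```

So a monitor amp a few dB quieter than the program material contributes almost nothing audible to the chain, which is the sense in which its requirements are "easy to meet".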

In 1977 there were a fair number of sonically transparent power amplifiers, but they were not the near-universal rule they are today. Nevertheless, my comments should not be interpreted as a blank check to use seriously degraded power amps as monitor amps in amplifier DBTs. I'm just trying to reduce the angst and misapprehensions over a long-standing issue to something manageable.