08 October, 2012, 02:21:38 PM

I've noticed an increase in forum debate about the validity of transferring the credibility of ABX from the physical domain to perception testing. I'm wondering if anyone has found a way past this issue?

The purpose of blind testing is to subtract subjectivity from the effect of - for instance - a drug trial: to assess a medication's impact on a subject's physiology without interference from their psychology. But what about when the purpose of a test is subjective perception? How do we then subtract the effect of the method to arrive at a meaningful outcome?

While we would like to remove expectation bias from the equation, if the conditions under which this is done also change the perceptive state of the listener, the test is invalidated as surely as it would be by tissue sample contamination.

Recent large scale public experiments by Lotto Labs (http://www.lottolab.org/) demonstrated that perceptual acuity is dramatically altered by test conditions: for instance, that time contraction/dilation effects are experienced when exposed to colour fields. In one experiment, two groups were asked to perform an identical fine-grained visual acuity test. One group was pre-emptively 'manipulated' by filling in a questionnaire designed to lower their self-esteem. This 'less confident' group consistently performed worse on the test than the unmanipulated one: their acuity was significantly impaired by a subtle psychological 'tweak' that wasn't even in effect during the test.

It seems undeniable that the much grosser differences between the mental states of sighted and 'blind' listening - considered generously - cast serious doubt on the results thus obtained.

The harder line is that blind perception tests are a fundamental misappropriation of methodology. In psychology it's axiomatic that for many experiments the subject must be unaware of the nature of the test (see Milgram). If a normalised state is not cunningly contrived, results are at best only indicative of what a subject thinks they should do; at worst, entirely invalid.

When probing hearing, the point is that the test must not change the mental state of the listener.

The contrast between outcomes of sighted and blind listening tests is as stark as those demonstrating suggestibility (see McGurk), but giving too much credence to such an intrinsically unsound experimental approach (not spotting this difficulty) does no favours to our credibility at all.

The only way past the dilemma seems to be direct mechanical examination of the mind during 'normal' listening to explore why the experiences of sighted and unsighted listening differ. This seems to be an interesting question.

In the meantime, the idea that - despite the method problem - results from blind ABX are valid is at least supported by the majority of data derived from home testing, Audio DiffMaker et al, so we needn't get hung up on it.

Despite common shorthand, ABX is not a test. It is a method. A method equally at home in a variety of settings.

Second, you appear to be picking and choosing your literature. You attack the idea of audio testing through a broadside against tests where the subjects are aware they are taking a test. Ignoring the irrelevant mention of purposeful manipulation of tester state (are you honestly proposing that HA's "default" ABX routine somehow systematically influences the self-esteem of participants?), you are ignoring the fact that there is no literature supporting your presupposition that tester awareness diminishes perceptual acuity.

Thirdly, you conclude (without backing) that "direct mechanical examination of the mind during 'normal' listening" must be employed, failing to defend, in the slightest, the idea that perception can be differentiated from subconscious neural activity. I can directly measure my pulse rate through a variety of methods. That is far from even hinting, much less proving, conscious perception of my pulse rate.

To rephrase: what, in any trial, is blind testing designed to filter out?

Not an attack - no need to be combative! - it's self-evident that psych evaluations often depend crucially on the subject not being aware of the purpose of the test. Why?

I think you may have misunderstood the Lotto Labs experiments I referred to: maybe check them out. They aren't about manipulating the tester's state: they explore how (surprisingly) easily perceptual states are changed by environmental conditions: changing the subject's mind changes the subject's mind . . . .

The degree to which perception can be differentiated from subconscious neural activity is a whole different (and tangential) question.

I think you may have misunderstood the Lotto Labs experiments I referred to: maybe check them out. They aren't about manipulating the tester's state: they explore how (surprisingly) easily perceptual states are changed by environmental conditions: changing the subject's mind changes the subject's mind . . . .

Not in the slightest. They aren't about manipulating the tester's state, but they are dependent upon manipulation of tester's state.

Regardless, you dodged the question. Call it "tester's state", call it "environmental conditions", call it what you will. Where is your argument, much less your evidence, that the manner of ABX testing as practiced creates a systematic bias in "environmental conditions"? The Lotto experiments were dependent on such a systematic influence. If you can't demonstrate one, They Are Not Relevant.

To rephrase then: what, in any trial, is blind testing designed to filter out?

Bias

Partly, yes: more specifically, in the clinical domain 'blindness' separates the psychological from the physiological. From the perspective of a drugs trial, psychological factors are generally extraneous and need to be excised from the process. From the perspective of an auditory trial, 'psychological factors' are the subject of the test.

I think you may have misunderstood the Lotto Labs experiments I referred to: maybe check them out. They aren't about manipulating the tester's state: they explore how (surprisingly) easily perceptual states are changed by environmental conditions: changing the subject's mind changes the subject's mind . . . .

Not in the slightest. They aren't about manipulating the tester's state, but they are dependent upon manipulation of the tester's state. Regardless, you dodged the question. Call it "tester's state", call it "environmental conditions", call it what you will. Where is your argument, much less your evidence, that the manner of ABX testing as practiced creates a systematic bias in "environmental conditions"? The Lotto experiments were dependent on such a systematic influence. If you can't demonstrate one, They Are Not Relevant.

Again, the state of the tester is irrelevant: it's the subject we're interested in - the testees, if you will - and the test environment. You're not making yourself clear: are you saying you don't like the conclusions of the experiments I referred to? Or that environment makes no difference to perception? Or are you claiming that listening to music for pleasure, through known equipment, is materially the same as listening analytically under test conditions, 'blind' to what you're hearing?

The degree to which perception can be differentiated from subconscious neural activity is a whole different (and tangential) question.

Agreed, it is totally off topic and undefended. But it is one you brought up.

As it seems to me that you introduced this irrelevant topic - and apparently vice versa - can we agree to move on?! I don't want to get bogged down redefining the problem, but would like to begin discussing solutions . . .

He expressed surprise that such a subtle a priori manipulation of the subject's self-esteem (of all things!) would depress visual acuity to the extent it did. Particularly germane to this discussion is that the test required subjects to distinguish two subtly different colours.

That particular experiment neatly illustrates the problem, but the issue doesn't hinge on any single experiment: the point is general to all such tests. If the method in any way modifies the 'normal' state of the listener, the data will be invalid, and subsequent statistical analysis is a fool's errand. Attempts to borrow the credibility of drug trials for the purpose of a perception test fatally misunderstand the purpose of such testing and are (at best) sloppy.

Although it's tempting to reach for the conclusion that almost everything is identical, there is an equally valid interpretation of results generated by DBT tests which invariably demonstrate a diminution of differences (oranges become more like lemons, Stradivari become more like toys, speakers become more like speakers) that seem apparent when sighted: namely that the method itself results in a diminution of differences. The more 'blunt tool' results emerge from blind perception tests, the less credible they look.

Partly, yes: more specifically, in the clinical domain 'blindness' separates the psychological from the physiological. From the perspective of a drugs trial, psychological factors are generally extraneous and need to be excised from the process. From the perspective of an auditory trial, 'psychological factors' are the subject of the test.

They can be so in medical DBTs too, e.g., the effectiveness of a drug or treatment as a pain reliever.


Although it's tempting to reach for the conclusion that almost everything is identical, there is an equally valid interpretation of results generated by DBT tests which invariably demonstrate a diminution of differences (oranges become more like lemons, Stradivari become more like toys, speakers become more like speakers) that seem apparent when sighted: namely that the method itself results in a diminution of differences. The more 'blunt tool' results emerge from blind perception tests, the less credible they look.

This is an argument from incredulity. You just can't believe that your/our sighted perceptions could be so *wrong*.

Explain, then, how people DO get it so wrong, in the case where two bottles of the same wine 'taste' vastly different, depending on how they are labelled. Or in the case where the listener 'hears' a vast difference between unit A and unit B, when in fact unit B has never even been put into the circuit.

Perhaps you could share with us a little about who you are so that we can put your point of view into proper perspective.

This site has TOS #8 in place to keep the signal to noise ratio high. As has been aptly pointed out, sighted tests provide absolutely no guarantee of reliability, whereas positive double-blind tests do. The only concern on the table that could possibly have any validity (I am being generous) is that double-blind testing might reduce the sensitivity of the person taking the test. This is perfectly fine since a failed DBT is not used as universal proof that two things must sound the same which is generally where those arguing on behalf of placebophiles get tripped up. FWIW, as a professional tester I can tell you that I actually pay closer attention to detail when I am consciously involved in a test, despite DBT skeptics and snake oil salesmen telling me that I can't or don't.

Last Edit: 08 October, 2012, 11:56:52 PM by greynol

Is 24-bit/192kHz good enough for your lo-fi vinyl, or do you need 32/384?

a failed DBT is not used as universal proof that two things must sound the same

Quoted for truth. Or more generally: we are seeking evidence for differences. Just because a test failed to verify, it does not imply it falsifies.

There are many cases where a DBT would fail to demonstrate (real) differences. E.g., there could be a difference only 1 out of 3 could ever detect. Then you (as one person) would more likely than not fail. Or it could be that your hit rate is only slightly above the 50/50 mark. Better than coinflipping, but you need a bit of data to actually verify it. Not enough data --> no verification. ... but then: would any of these issues be resolved by simply equipping the test subject -- or the person administering the test -- with a set of bias-provoking prejudices?
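The "better than coinflipping, but you need a bit of data" point can be made concrete with the exact binomial arithmetic behind a standard forced-choice ABX run. This is a minimal stdlib-only sketch (the function names `abx_p_value` and `abx_power` are mine, not any forum tool's); it assumes the usual one-sided test against a guessing rate of 0.5:

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: probability of getting at least
    `correct` hits in `trials` forced-choice rounds if the listener
    is purely guessing (true hit rate 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def abx_power(trials, true_rate, alpha=0.05):
    """Probability that a listener with the given true hit rate
    reaches the significance threshold in a single test run."""
    # Smallest number of hits that would count as a 'pass' at this alpha.
    critical = next(k for k in range(trials + 1)
                    if abx_p_value(k, trials) <= alpha)
    return sum(comb(trials, k) * true_rate ** k * (1 - true_rate) ** (trials - k)
               for k in range(critical, trials + 1))

# 12/16 is a passing score at alpha = 0.05; 11/16 is not:
print(abx_p_value(12, 16))   # ~0.038
print(abx_p_value(11, 16))   # ~0.105
# A listener who genuinely hears the difference 60% of the time
# passes a 16-trial run less than one time in five:
print(abx_power(16, 0.60))
```

So a hit rate only modestly above chance is real but will usually "fail" a short test, which is exactly why not-enough-data means no verification rather than falsification.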

Now there are cases where DBTs are not too feasible. Heart transplants, for example. You don't open a patient, remove the heart and blindfold the surgeon before flipping the coin over whether the old or the new heart is to be inserted (and then make sure the patient does not know). And who has tried to DBT different live venues? That does not mean that the double-blindness is bad, just that it might be hard to obtain.

He expressed surprise that such a subtle a priori manipulation of the subject's self-esteem (of all things!) would depress visual acuity to the extent it did.

[...]

Although it's tempting to reach for the conclusion that almost everything is identical, there is an equally valid interpretation of results generated by DBT tests which invariably demonstrate a diminution of differences (oranges become more like lemons, Stradivari become more like toys, speakers become more like speakers) that seem apparent when sighted: namely that the method itself results in a diminution of differences.

You can of course link to reliable scientific metastudies documenting that 'invariably', and that the differences that disappear are for real? Not that the effect is proven every now and then in certain setups, but that it 'invariably' is so?

And still this is no excuse for uncontrolled tests. Under the (hardly controversial) hypothesis that test setups can manipulate the test subjects in a manner crucially affecting the results, one takes measures to test ceteris paribus; that is, at least until one can establish that a certain manipulation introduced to the test setup is more likely to get you true answers (not only cocksure answers) and can be administered reliably. Good if we can find one (and in certain setups one can isolate the effect of this and correct for it, which is unlikely to happen if a marketing guy is about to convince you).

Look, the scientific conventions are themselves biased towards the null hypothesis: one has accepted standards which by themselves fail to accept lots of true answers, because one does not want to accept false answers. That means that a lot of facts will have the status of unconfirmed hypotheses for a long time. If the only test result that could possibly indicate a difference is prone to indicate a false difference stemming from mumbo-jumbo marketing, then one should not accept it as evidence. Too bad if the difference is for real, but that's the cost of not being gullible. (Too bad if the millions I was offered from Nigeria yesterday were for real too, but heck ...)

Although it's tempting to reach for the conclusion that almost everything is identical, there is an equally valid interpretation of results generated by DBT tests which invariably demonstrate a diminution of differences (oranges become more like lemons, Stradivari become more like toys, speakers become more like speakers) that seem apparent when sighted: namely that the method itself results in a diminution of differences. The more 'blunt tool' results emerge from blind perception tests, the less credible they look.

This is an argument from incredulity. You just can't believe that your/our sighted perceptions could be so *wrong*.

Easy tiger: that's a straw man - heck, can we do better than reach for these lazy forumspeak dismissals? Not an argument from incredulity: I'm simply describing results generated by DBT perception testing in the least controversial manner possible, but referring to an equally valid interpretation.

Explain, then, how people DO get it so wrong, in the case where two bottles of the same wine 'taste' vastly different, depending on how they are labelled. Or in the case where the listener 'hears' a vast difference between unit A and unit B, when in fact unit B has never even been put into the circuit.

That's the interesting question, isn't it? It's desirable to remove the powerful filter of expectation bias in such tests - in an attempt to reach the 'actual experience' rather than the 'perceived experience' of the subject (becoming a slippery concept at this point). It's self-explanatory that such bias operates negatively: we don't need further proof that people see and hear things that aren't there if they receive sufficient suggestion. Blah blah QED.

But what is expectation bias and why is it such a strong force in our perception? Broadly, EB is a key part of our mental mechanism for predictive modeling and pattern-building. We are very powerfully hardwired to gather clues from our immediate environment, from which we build a framework for what happens next.

Sensory deprivation has profoundly disorienting effects - particularly time sense - because it removes the markers needed to build perceptive frameworks. Similarly, being fed false or uncertain cues, or being deprived of them altogether (the crucial 'blind' part of a perception test) may impair construction of a perceptive framework: the ability to identify and model characteristic differences between unknown stimuli.

Removing expectation bias throws out the baby with the bath-water.

Beyond doubt, though, is that creating this environment is a major shift in mental state of the subject: there is precedent for such 'panic response' modes accelerating (adrenal reflexes) and repressing (quiz show contestant) brain function, but the overwhelmingly homogenised results generated by DBT point strongly to the latter. Perhaps there is useful research on this somewhere: I'm not aware of it.

This site has TOS #8 in place to keep the signal to noise ratio high. As has been aptly pointed out, sighted tests provide absolutely no guarantee of reliability, whereas positive double-blind tests do. The only concern on the table that could possibly have any validity (I am being generous) is that double-blind testing might reduce the sensitivity of the person taking the test. This is perfectly fine since a failed DBT is not used as universal proof that two things must sound the same which is generally where those arguing on behalf of placebophiles get tripped up.

Positive DBT is inherently cast-iron. The problem is that negative results equally indict the efficacy of the method, and that DBT perception tests are anathema: they generate results with poor resolution: they conform suspiciously well to the 'bad test' model: ie, they generate positives for gross phenomena but fail to recognise fine-grained distinctions. Wrong sieve size is a plausible diagnosis. Given that the test is misappropriated from a different domain and therefore - by definition - crudely tampers with its objective, this isn't surprising.

FWIW, as a professional tester I can tell you that I actually pay closer attention to detail when I am consciously involved in a test, despite DBT skeptics and snake oil salesmen telling me that I can't or don't.

That's exactly the point: test conditions create an environment in which you have to 'pay closer attention' - in reality, listen in an entirely different way, disorientated and deprived of cues. For a psych test, that's inadmissible.

Again, the purpose of DBT is to remove subjectivity as a factor. It can't legitimately be applied with any degree of precision to a study of subjectivity. Negative DBT results in the physiological domain are always open to question, but in this domain they aren't even interesting, and it's an embarrassment to the cause to see such faith placed in them.

You can of course link to reliable scientific metastudies documenting that 'invariably', and that the differences that disappear are for real? Not that the effect is proven every now and then in certain setups, but that it 'invariably' is so?

And still this is no excuse for uncontrolled tests. Under the (hardly controversial) hypothesis that test setups can manipulate the test subjects in a manner crucially affecting the results, one takes measures to test ceteris paribus; that is, at least until one can establish that a certain manipulation introduced to the test setup is more likely to get you true answers (not only cocksure answers) and can be administered reliably. Good if we can find one (and in certain setups one can isolate the effect of this and correct for it, which is unlikely to happen if a marketing guy is about to convince you).

It would have been interesting to learn from that experiment the ratio of self-esteem depression to acuity. Then again, how is that measured? Beau Lotto is a frequent speaker on TED: his Public Perception project is worth a look.

This forum is a good repository of DBT acuity depression 'invariability' - or, by another interpretation of the same results - The Truth. However, the latter interpretation places an uncomfortably unquestioning faith in the application of the method.

Look, the scientific conventions are themselves biased towards the null hypothesis: one has accepted standards which by themselves fail to accept lots of true answers, because one does not want to accept false answers. That means that a lot of facts will have the status of unconfirmed hypotheses for a long time. If the only test result that could possibly indicate a difference is prone to indicate a false difference stemming from mumbo-jumbo marketing, then one should not accept it as evidence. Too bad if the difference is for real, but that's the cost of not being gullible. (Too bad if the millions I was offered from Nigeria yesterday were for real too, but heck ...)

'Null' definition depends on the intent of the test. It's perfectly proper that we should rail against commercial exploitation: if you tell someone an amplifier costs £5000, they will likely believe it sounds better than a £500 one, even if the labels are swapped. But there's a disturbing lack of rigour that muddies the waters when the intent of widespread homebrew tests is retaliatory, not exploratory. It's ironic that negative DBT results are spun with evangelical zeal as misleading as any manufacturer's hype - particularly when based on a spurious borrowing.

Fundamentalists of either stripe cling to mechanical measurements and DBT negatives on one side of the fence, and the primacy of experience on the other. Personally, I'm reluctant to come down on one side or the other, because of fundamental conflicts and flaws in the position of both camps. But I do think we should be open-eyed and honest about it: this is one time being blind doesn't help.

The only people I know of who shun the DBT method of testing audio equipment are those who live off it in some way (editors/writers of hifi magazines, hifi salesmen) and people who believe in their superiority over the common plebs in terms of hearing. The first kind knows that utilizing DBT in their reviews would make sales plummet, and would mean the loss of income through paid reviews and advertising. The second are often technologically disabled, and are more prone to explain things to themselves (and, more dangerously, to others) through pure magic and rituals than to actually learn what is going on, because the bubble they live in would burst.

Now, you say DBT testing would somehow influence the listener and he wouldn't hear the difference, because of reasons. Bear in mind that these people often claim a "sky-earth" difference between two DACs, for example, so I find it hard to believe that testing that difference over the course of time (a month, a year) would involve any stress, or that they wouldn't hear even the tiniest difference, if it really exists.

That argument is simply invalid: if you are so easily affected by switching buttons from A to B, X to Y, then I am sure that every listen to the same song is a new experience and it sounds different altogether. That difference either is there or it is not; it doesn't exist only when we are casually listening to music. Humans can't telepathically affect the bitstream in DACs or optical cables yet. The equipment doesn't care what you are feeling; it just streams and decodes, over and over again, every time you play the song.

I really don't care how ABX testing in medical research works - I'm not into medicine at all, and for hydrogenaudio's sake, it shouldn't matter. The only thing that matters is the audio ABX test, which lets individuals see whether they really can hear a difference between two codecs, or two DACs, if they have the equipment to set this up. Individuals set up the testing environment as they prefer (I like drinking cocoa, for example), and the test is straightforward in its results: either you can hear the difference, or you can't. If you can't, that doesn't mean there is none; it just means that you can't hear it. Someone else might.

So, why do you try so hard to convince us that ABX isn't a valid method?

What I can perhaps maybe possibly gather from your posts is that blind perception experiments are crude. What I don't gather is how the negative results have any impact on the positive results.

Ideally you would stop rambling and be more concise, but at the very least, you should explain explicitly what it is that is faulty. You begin by questioning "the credibility of ABX from the physical domain to perception testing." If it is this broad challenge, then consider the fact that Signal Detection Theory and discrimination experiments, ABX being one of them, have widespread use in the speech perception literature. Are you suggesting that subjects performing well in spite of the lack of cues leads to flawed conclusions?

But the issue closer at hand seems to be the much narrower one, of the retaliatory use of negative ABX results as evidence that things sound the same. And yet you keep responding to comments about the scientific (in)validity of this with comments such as "The problem is that negative results equally indict the efficacy of the method" without explaining how this is so.

The only people I know of who shun the DBT method of testing audio equipment are those who live off it in some way (editors/writers of hifi magazines, hifi salesmen) and people who believe in their superiority over the common plebs in terms of hearing. The first kind knows that utilizing DBT in their reviews would make sales plummet, and would mean the loss of income through paid reviews and advertising. The second are often technologically disabled, and are more prone to explain things to themselves (and, more dangerously, to others) through pure magic and rituals than to actually learn what is going on, because the bubble they live in would burst.

Now, you say DBT testing would somehow influence the listener and he wouldn't hear the difference, because of reasons. Bear in mind that these people often claim a "sky-earth" difference between two DACs, for example, so I find it hard to believe that testing that difference over the course of time (a month, a year) would involve any stress, or that they wouldn't hear even the tiniest difference, if it really exists.

That argument is simply invalid: if you are so easily affected by switching buttons from A to B, X to Y, then I am sure that every listen to the same song is a new experience and it sounds different altogether. That difference either is there or it is not; it doesn't exist only when we are casually listening to music. Humans can't telepathically affect the bitstream in DACs or optical cables yet. The equipment doesn't care what you are feeling; it just streams and decodes, over and over again, every time you play the song.

I really don't care how ABX testing in medical research works - I'm not into medicine at all, and for hydrogenaudio's sake, it shouldn't matter. The only thing that matters is the audio ABX test, which lets individuals see whether they really can hear a difference between two codecs, or two DACs, if they have the equipment to set this up. Individuals set up the testing environment as they prefer (I like drinking cocoa, for example), and the test is straightforward in its results: either you can hear the difference, or you can't. If you can't, that doesn't mean there is none; it just means that you can't hear it. Someone else might.

So, why do you try so hard to convince us that ABX isn't a valid method?

I think perhaps you underestimate the general level of the public's intelligence. Anyone who buys a piece of audio equipment knows that most visitors to their house will point out that this system sounds pretty similar to the last one they were excited about.

At the back of their mind, most buyers know that past a certain basic level of competence, expensive equipment is all about counting the number of angels dancing on the head of a pin. But people go to see illusionists because the illusion is fun. People buy fancy boxes because there is pride of ownership - and - away from the hype, alone in their living room - for whatever reason - there is an absolutely real, fundamental sense of pleasure in music reproduction that may - or may not - derive from the measured performance of the boxes. There is also the unshakeable fact that humans are status-driven, and that audio equipment is a status symbol, just like a car.

This may all be wrong, but it will persist. It is unaffected by our little erudite discussions about what ultimately can be measured or perceived.

I also personally know a number of editors, writers, salesmen and manufacturers: some do think they are superior to the plebs, but not usually because of their hearing: it's just their character. Similarly, I know a number of judges and car salesmen and postmen who feel exactly the same way. It's also completely untrue that DBT is unused in the audio industry: it's a standard tool for many makers and reviewers.

The sole, specific point I'm making is that DBT is rarely used in perception testing, for the obvious reasons outlined above, and that attempting to borrow its credibility from the physiological domain is intellectually dishonest. And that the abundance of negative results indicates coarse granularity in the test method as much as it supports any particular paradigm.

The sole, specific point I'm making is that DBT is rarely used in perception testing, for the obvious reasons outlined above, and that attempting to borrow its credibility from the physiological domain is intellectually dishonest. And that the abundance of negative results indicates coarse granularity in the test method as much as it supports any particular paradigm.

The only reason there are so many "failed" ABX tests is simple - people tend to believe in many magickal beings living in their amps and speakers and wires and headphones. But when put in front of a magnifying glass, those little creatures tend to disappear. It's a human fault of believing in ghosts, rather than a fault of the testing method. And that's it.

What I can perhaps maybe possibly gather from your posts is that blind perception experiments are crude. What I don't gather is how the negative results have any impact on the positive results.

Ideally you would stop rambling and be more concise, but at the very least, you should explain explicitly what it is that is faulty. You begin by questioning "the credibility of ABX from the physical domain to perception testing." If it is this broad challenge, then consider the fact that Signal Detection Theory and discrimination experiments, ABX being one of them, have widespread use in the speech perception literature. Are you suggesting that subjects performing well in spite of the lack of cues leads to flawed conclusions?

But the issue closer at hand seems to be the much narrower one, of the retaliatory use of negative ABX results as evidence that things sound the same. And yet you keep responding to comments about the scientific (in)validity of this with comments such as "The problem is that negative results equally indict the efficacy of the method" without explaining how this is so.

Abstract: A positive DBT result establishes reliably that two outcomes or entities differ. A negative means - equally - either a) the two objects are identical, or b) that the method doesn't permit resolution of their differences. Inherent in DBT is that 'blindness' de-normalises test conditions: the mechanism by which expectation bias is removed represses acuity by removing cues from which perceptive frameworks are built. DBT is designed to separate subjective and objective responses and provide hard-to-falsify positive outcomes. Its suitability for physiological testing is not transferable to psychological testing, but negative outcomes are aggressively touted as meaningful.
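
To make the asymmetry concrete, here's a minimal sketch (in Python, with hypothetical scores) of how an ABX session is typically scored with a one-sided binomial test: a high score is hard to explain away as guessing, while a middling score cannot by itself distinguish "the two are identical" from "the test didn't resolve the difference".

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: the chance of scoring at least
    `correct` out of `trials` by pure guessing (p = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# A "positive" result: 14/16 correct is very unlikely by guessing (p ~ 0.002).
# A "negative" result: 9/16 correct (p ~ 0.40) is consistent with guessing --
# but equally consistent with a real, small difference the test failed to resolve.
```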

The only reason there are so many "failed" ABX tests is simple - people tend to believe in many magickal beings living in their amps and speakers and wires and headphones. But when put in front of a magnifying glass, those little creatures tend to disappear. It's the fault of humans believing in ghosts rather than the fault of the testing method. And that's it.

Sure, people believe in crazy things.

I'm more interested in what happens when subjects are put in front of a magnifying glass: whatever disappears from their mind in those conditions is just as interesting as whatever does or doesn't 'appear' (sonically) in the test room.

"A negative means - equally - either a) the two objects are identical, or b) that the method doesn't permit resolution of their differences."

Means equally to whom? You? "To be sure, statements have been made in the literature to the effect that human listeners simply cannot perceive certain auditory properties of speech sounds, and this has, of course, been grist for the psychophysical mill. Apart from dismissing such extreme claims..." (Rapp 1986)

"Its suitability for physiological testing is not transferable to psychological testing"

You haven't established this at all.

"but negative outcomes are aggressively touted as meaningful."

And when it's not so touted? Should we throw all positive experimental babies out with the bathwater because of overaggressive interpretation in other cases?

Positive DBT is inherently cast-iron. The problem is that negative results equally indict the efficacy of the method, and that DBT perception tests are anathema: they generate results with poor resolution; they conform suspiciously well to the 'bad test' model, i.e. they generate positives for gross phenomena but fail to recognise fine-grained distinctions. Wrong sieve size is a plausible diagnosis. Given that the test is misappropriated from a different domain and therefore - by definition - crudely tampers with its objective, this isn't surprising.
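
As an illustration of the 'sieve size' point, a small simulation (Python, with an assumed 60% per-trial accuracy standing in for a real but subtle audible difference) shows how often a standard 16-trial ABX session returns a negative result anyway:

```python
import random
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: chance of at least `correct` hits by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def power(p_hear, trials=16, alpha=0.05, runs=20000, seed=1):
    """Fraction of simulated ABX sessions that reach significance when the
    listener's true per-trial accuracy is p_hear (a hypothetical figure)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        correct = sum(rng.random() < p_hear for _ in range(trials))
        if abx_p_value(correct, trials) <= alpha:
            hits += 1
    return hits / runs

# With a subtle but real 60% accuracy, a 16-trial test misses the
# effect roughly five times out of six (power is only about 0.17).
```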

With respect, this is demonstrably wrong. The core tests that probe the very limits of human hearing use blind testing, and deliver results that match predictions from the known physiology of the ear. To get these results takes careful training - people need to learn what to listen for before they can hear as well as the physiology would predict.

You've written many words, but like most blind testing bashing, it comes down to this: "when people are under test, they listen differently so we can't know what they really hear. If they don't know what they are listening to, they are even more stressed." What if they do know what they are listening to, like in most hi-fi magazine reviews? The people are still "under test", yet seem to hear just fine? Given that knowing what you are listening to is both the only differentiating variable, and a known feature that will give completely unreliable results, you are either wrong (that's my guess), or are correct and have just kicked audio into a "never possible to know" philosophical world.

Cheers, David.

'Knowing what you are listening to' is the troublesome variable. The power of suggestion derives from pre-erecting a framework with given reference points. The mind obediently attaches incoming sense data to that superstructure because it's hard, slow work to build a model from scratch. Proprioception is another example of the mind slowly building consistent external reality models only via trial and error cycles (Oliver Sacks has a fine description of this in 'A Leg to Stand On').

Deprived of reference points, acuity suffers. Not badly enough to become deaf, obviously - but badly enough to diminish large variables to small ones, and make small ones vanish entirely - which isn't a bad one-line summary of DBT perception results, particularly with reference to hearing, which - being driven by feebler mental horsepower - is more prone to suggestion (and more in need of supporting frameworks) than sight (hence McGurk).

Either this is wrong (as you say), or truly accurate testing of this type will come later, when we can directly, mechanically examine - and analyse - brain response without tampering with the subject's psychological state. Certainly not a 'never possible to know' scenario.

DBT is designed to separate subjective and objective responses and provide hard-to-falsify positive outcomes.

I don't know why you introduced new terms, namely subjective and objective at this point.

You seem to be confused. Blind testing is merely used to control the various possible influences in a test. Objectivity and subjectivity need not be relevant.

For example I can control influences on a test so that subjectivity is maximized and objectivity is minimized. Or not. Or the opposite.

The opposite of a blind test is thus a completely uncontrolled test in which I have no idea what is or is not influencing the outcome.

This seems to be such a poorly-conditioned situation that it raises the question, why are you wasting your time doing this? ;-)

In the audio world non-blind testing is commonly used to make the outcome of the alleged test strongly influenced by expectations or other needs such as the need for an anecdote for a review promoting a certain product.

Quote

Its suitability for physiological testing is not transferable to psychological testing but negative outcomes are aggressively touted as meaningful.

Again you seem to be confused. Do you think that audio testing is physiological, psychological, or technical? You seem to have excluded testing for technical purposes for some reason. This seems strange because audio listening tests are usually represented as being tests of audio products, not people.

This forum is a good repository of DBT acuity depression 'invariability' - or, by another interpretation of the same results - The Truth.

Ehm ... are you sure you don't confuse The Truth with Conclusions Inferred From Reliable Evidence?

Scene I: (1) You have an entertainment show, where you roll six dice in front of a large bunch of people. (2) You take a sneak peek and know they are 1-2-3-4-5-6. (3) I guess 1-2-3-4-5-6, and it turns out to be The Truth. (4) Journalist calls up and wants an interview with this clairvoyant guy.

Obviously, it need not be that I am clairvoyant. The journalist has no control over the experiment, and doesn't know how far out of the null hypothesis this is (for that one would at the very least need to know the number of people). Solution: do a controlled test.
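
For illustration, the arithmetic the journalist is missing can be sketched in a few lines of Python (assuming every audience member makes one independent, uniformly random guess at the ordered result of the six dice):

```python
def chance_of_some_hit(n_people, outcomes=6 ** 6):
    """Probability that at least one of n_people guesses the exact ordered
    result of six dice (6**6 = 46656 equally likely outcomes), assuming
    independent uniform guesses."""
    return 1 - (1 - 1 / outcomes) ** n_people

# One guesser hitting 1-2-3-4-5-6 is a 1-in-46656 event; with an audience
# of 46656 people, at least one perfect guess happens well over half the
# time -- no clairvoyance required.
```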

Scene II: The journalist has no idea what a controlled test environment is, and puts you in front of me, rolling the dice. (1) You roll six dice. (2) You take a sneak peek and know they are 1-2-3-4-5-6. (3) I am thinking really hard now. I guess ... 1? Your face lights up. I am meandering ... maybe another one, or maybe not, maybe a 2? Your face lights up. Et cetera. (4) Wow! News story!

Here the issue is not the randomness. The issue is that the test is not double-blinded, and your behaviour influences my answers.

Now suppose, for the sake of the discussion, that clairvoyance isn't impossible. Suppose, for the sake of the discussion, that I am indeed clairvoyant -- I only need to think very hard first. In that case, that is The Truth. But is there any evidence for it? Surely not.