You can't hear the difference, because it's so subtle that's not worth mentioning... It really doesn't affect anything important as far as the resulting sound is concerned. And that's what's really important - under normal, usual listening conditions you don't hear any difference.

Whilst it's virtually against the conditions of use here to suggest that CDs aren't transparent, if it were a problem in a psychoacoustic codec that was "so subtle that {it}'s not worth mentioning... under normal, usual listening conditions you don't hear any difference" there would be 10 people trying to track the "problem" down and fix it!

I don't trust these kinds of comparisons that are not rigorously controlled. Could you give more details?

I'm sure I've tried to write a definitive report of my experience before, but I can't find it, so here it is...

It's the 109th AES Convention in LA. I'm presenting a paper, but while I'm there, my tutor has "volunteered" me to help with an audio demonstration. As it turns out, they don't need much help, but I go along anyway because it's very interesting.

The people involved are David Chesky and Kevin Halverson. Other "well known" people involved were dCS, who provided the converters, and Studer, who were kind enough to align the tape recorder before use.

David Chesky was there for reasons I'll explain in a moment. Kevin Halverson was there because most of it was his equipment, and he was operating it. They both had a very down-to-earth attitude to the audio industry: "How do you finish up with a small fortune after making a Jazz record? Start with a large fortune!" "Of course our company doesn't matter in the industry - heck - Sony's budget for paper clips is bigger than our turnover!" They both made quality recordings and equipment for the sake of it, though I was interested to hear that David Chesky would rather read a book when he gets home than listen to music. David was more of an artist, Kevin more of an engineer. They were kind enough to explain everything that was there, and let me play with all the equipment.

There were two demos, one relevant to this discussion, the other I'll explain too because it's probably more interesting!

Demo 1: 6-channel surround sound.

The main demo was 6-channel surround sound. Most of us have stereo. Some of us have 5.1. David was proposing that 5.1 isn't ideal for music. Since DVD-A and SACD actually have 6 full bandwidth channels, he wanted to use them. He didn't see the point of using one for the centre channel when his recordings already had a solid centre image. He didn't see the point of a dedicated .1 channel when music listeners should have full range speakers anyway, and their amplifiers should handle the bass management when they didn't. Plus, he had a much better use for these two channels.

He started with stereo (+/-30 degree spaced speakers), and used the normal surrounds (+/-110 degree spaced speakers, but this spacing could ideally be something else (I forget what) and in practice could be virtually anything that would fit into your room, so long as they were symmetrical behind you). To these four speakers, he added another two at +/-55 degrees front. The idea is that, in many real music listening situations, you get echoes from this direction. Good concert halls have their first main reflection at around this angle. So do many rooms. Since the human ear judges distance partly on the basis of comparing direct and reflected sound, it's good to make this cue more accurate. These extra speakers were raised off the floor by several feet, the idea being that in any realistic situation the audience would prevent you hearing anything lower from this angle.

You can argue with the reasoning behind this - whatever the justification, I think David chose it because it "just works". He'd decoded his B-format ambisonic masters into a 6-channel configuration and played with the result. For playback, he was using a PC with Cool Edit Pro running in multitrack mode. The 6 channels were output as 3 stereo pairs, to a special 6-channel sound card that Kevin had prepared for the demo. All files were 24-bit 96kHz. The PC could just keep up. You could switch channels on and off using the solo and mute controls in CEP, and I even dropped down into edit view a few times to try some spectral analysis on the various channels.

So, what did it sound like? These recordings were mostly things that David had already released on CD and got excellent reviews for. So, when listening to the stereo version, you were listening to a 24/96 version of what was available on CD. And it sounded pretty good. If you added the 2 rear surround channels, you got a nice bit of echo as well. This is like 5.1, but without the centre and sub. It's quadraphonic, if you like. It sounds nicer than stereo. Nice enough to go out and buy more amps and speakers? Depends. Then you switch on the two extra front speakers - wow! I mean - really - wow! The front sound stage went from maybe 6 feet to 20 feet across - and the depth! Sometimes CD reviewers go on about great depth in CD recordings - I know what they mean, but really, that's nothing! It's a few feet - this was like 20 feet! "You could shoot an arrow into that sound stage" was how one visitor described it.

Some of the recordings sounded "artificial" - larger than life. This was just due to the miking - David was still experimenting. Others sounded so realistic it was breathtaking. You had to be in the sweet spot for the most magic effect, but wow! Imagine sitting close to a cinema screen. Now imagine that it's in perfect 3D - so you know you're in a movie theatre because it's still there at the edge of your vision, but all in front of you is a different place. That's a visual analogy of what it sounded like.

I tried some interesting experiments. Firstly, if you turn off the main stereo pair, there's no real sense of sound stage or image - it just sounds like random echo. Secondly, if you turn off just the back surrounds, leaving the front 4 speakers on, it still sounds amazing. OK, there's no echo from the back anymore, but the front sound stage is still just as huge. If you only had four channels, it would be much better to have four at the front rather than two front and two back with these recordings. Strange, but true. Try selling that to Joe Public.

The speakers were $3000 a pair (3 pairs required), and the amps were much more expensive. So I have no idea what this would sound like on equipment that I could afford! But in the demo, it was just fantastic. Stereo was pathetic by comparison. I don't own a surround system myself, partly because 5.1 just can't compare to what I’ve heard from 6.0.

Demo2: different sampling rates.

Most people visiting the demo room wanted to hear demo 1. But a few were very interested in demo 2. We had a Studer 2-track studio machine, a master tape from a 1960s classical recording (apparently an audiophile-beloved rendition of Scheherazade), dCS pro A>D and D>A converters, and the same amps and speakers from the 6-channel demo, just running the front stereo pair.

Simply, you could change the source selector on the pre-amp to compare the analogue source directly with the source sent via the A>D and D>A converters. On the A>D converter you could choose any rate you wanted, and the D>A would oblige. You couldn't switch sample rates while monitoring because there was a glitch while the D>A caught up - so any digital version was always interspersed with the direct analogue feed, by switching the pre-amp over while the A>D rate was changed. It was also possible to change the bit depth, but we left it at 24-bits. DSD was also possible, apparently, but would mean changing the digital interconnection, which we didn't attempt.

So, you could compare analogue against 32, 44.1, 48, 96 and 192kHz digital. You could have 88.2 and 176.4 too, but we didn't bother.

People usually couldn't hear a difference. They'd ask us to switch more quickly than was possible, saying it was impossible to hear a difference, because by the time we'd switched, the mood of the music had changed. They wanted to hear the loud cymbal crashing bit most, convinced they stood most chance of discerning a difference during this. But most failed. And I'll tell you the truth - I sat in the sweet spot, listened both sighted and blind and couldn't hear any difference.

The next day, while the demo was being run for the Nth time, I was at the back of the room talking with someone. Suddenly, I heard a difference as the source switched. I was surprised, having failed to hear a difference the previous day listening in the sweet spot. I listened as it switched again, and heard it switch back – ah ha, it must have just gone analogue / digital / analogue. I kept listening – I couldn’t hear the difference next time it switched.

I went to the middle/back of the room, and listened through the next demo. Without being told, I could pick out 44.1 and 48kHz. The difference was more obvious back from the sweet spot than in the sweet spot itself. More importantly, the difference wasn’t what I (or the other people who failed to hear it) had been listening for. It didn’t make any difference to the frequency response at all, or to the clarity of the high frequencies.

What 44.1kHz and 48kHz did do was to make the sound slightly less realistic, like the difference between a good and bad CD player. If the lower sampling rate had any defined “quality” it was a glassy kind of sound – I’d heard that word associated with CD before and thought it was complete rubbish – but now I actually heard the difference, I understood exactly what people had meant.

The change from 44.1 or 48kHz to analogue to 96kHz slightly increased the depth of the sound stage. I’d been listening to the amazing demo 1 for 2 days, so it was hardly an impressive difference, but it was still there.

If you’re counting, that’s only two blind detections – once when I wasn’t even listening, and again when I went back to the middle of the room to check – I confirmed which had been which with Kevin afterwards – “The next to last one was 40something, wasn’t it?”

You can say many things about this. You could say it was just luck, but I don’t think it was – I wasn’t even listening for the difference because (having listened the previous day) I didn’t think there was one to hear! You can say that I was hearing sonic deficiencies in the equipment. Well, maybe. That may be what the whole 44.1/96k debate is based on. All I can say is that, if there are sonic deficiencies in this equipment (I think the dCS boxes are around 5k each, and are used in many recording studios) then there isn’t much hope for the rest of us!

What you could say, with some justification, is that the “character” of 44.1 was more obvious outside the sweet spot, so maybe it’s not such a big issue. That’s probably true – except that maybe I was just listening for the wrong thing when I was “in the sweet spot”. Maybe I had to stop listening to the Hi-Fi, and start listening to the music and the performance to hear what was happening.

What is significant is that the 44.1kHz version wasn't just different from the 96k and analogue versions, it was [i]worse[/i]. As the analogue was the master, any difference would be bad news, but for it to be subjectively worse makes matters even, well, worse!

I was upset to think how much recorded music only exists as a 44.1kHz or 48kHz sampled digital master tape. I discussed the subjective imperfections (the improved depth and realism of the 96k version) with Kevin, and he agreed. He was surprised that I'd noticed it that day, when I couldn't even hear anything wrong with 32kHz the previous day! I asked him what he heard with 16-bit (we'd been using 24-bit all along) and DSD. He said 16-bit was even worse - it made the whole sound "grungy", and that DSD sounded nice, but added its own signature. "You can tell when you're playing DSD through this system - the room heats up" he said - I looked at the huge amps, and could believe it.

One thing I should note: I didn’t think the analogue master was particularly good quality. It was a gorgeous recording, but it had obvious flaws – e.g. background noise, and some audible edits. Also, I didn’t hear any difference between analogue, 96k and 192k. I can’t explain why 44.1kHz and 48kHz sounded worse, but they did. No one responsible for the demo had any reason to rig the results, and I played with enough of the equipment to know that everything was above board and fair, even though some of the cables we used might not have met with audiophile approval.

Demo 3

In another room, they had a 1960s recording studio set-up, and they had several master tapes from the 1950s and 1960s. Some actually were the masters (Jackie Wilson live somewhere), others were direct dubs from them, including Elvis, from a 3-track master – not many people have heard an Elvis recording in the original 3-track stereo!

There were plenty of other demos around. The DSD 5.1 demo was terrible, but that was due to equipment and volume. Most other demos sounded very harsh and artificial compared to the three I've mentioned. Not because they were using CDs as sources, but because modern studio and mastering practice gives rise to dubious quality recordings, as we often discuss here.

It seemed quite perverse that the nicest sounds at the AES were a 6-channel demonstration of recordings and equipment that you can't even buy, and a tape from the early 1960s which still hasn't been released in its original 3-track format.

This snippet has only one, unambiguous, interpretation, given that it doesn't contain any frequencies over 22050 Hz ...

I believe it does contain frequencies above 22050 Hz, since I did not obtain it by properly filtering and downsampling it (from e.g. 192 kHz), but instead by fiddling with Cool Edit in the 44.1 kHz domain.

Doh! We were doing so well! If it's sampled at 44.1kHz, it doesn't contain anything unique above 22.05kHz. Anything above this is a copy of what's below it. And anything below it that you intended to be above it, isn't!

Yes, exactly! I may have an explanation for this: above 22.05 kHz there are no unique frequencies (as you pointed out earlier, these are mirrored and copied from below 22.05 kHz), but of course they are frequencies nonetheless. This way one can indeed represent frequencies above 22.05 kHz with a sampling frequency of 44.1 kHz - but only at the price of adding the respective (mirrored/copied) distortion frequencies below 22.05 kHz as well, and only if no filtering is done in the recording step (brickwall at 20 kHz).
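The mirroring is easy to demonstrate numerically: a tone placed above Nyquist is indistinguishable, once sampled at 44.1 kHz, from its folded image below Nyquist. A minimal sketch (numpy assumed; the tone frequency is just an illustrative choice):

```python
import numpy as np

fs = 44100                      # CD sampling rate
t = np.arange(fs) / fs          # one second of samples
# Try to represent a 30 kHz tone -- above the 22.05 kHz Nyquist limit
x = np.sin(2 * np.pi * 30000 * t)

spec = np.abs(np.fft.rfft(x))
peak_hz = np.argmax(spec) * fs / len(x)   # bin spacing is 1 Hz here
# The energy lands at the mirrored frequency 44100 - 30000 = 14100 Hz --
# exactly the "distortion frequency below 22.05 kHz" described above
```

So the "frequencies above 22.05 kHz" in a 44.1 kHz file are never free: every one of them has an alias sitting in the audible band.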

Quote

Quote

I compared the Cool Edit upsampling algorithm with several of its filtering algorithms. I found: doing the silence fiddling in the 192 kHz domain and then filtering out anything above 22050 Hz looks much better than doing that fiddling in the 44.1 kHz domain and then upsampling it.

There shouldn't be any difference. (...) But you should be able to get a very similar result in CEP with the right filter. Really!

Perhaps the "implicit" frequencies above 22.05 kHz (see above) help the Cool Edit filters here. But this is not relevant anyway.

Quote

P.S. CD "sound" (if there is one) is glassy, and less realistic than 24/96. This is what I heard with professional dCS convertors, comparing A>D>A at 44.1kHz and 96kHz (both 24-bits, so both are actually better than CD) with the original analogue signal. (...) I can quite believe record producers who use this equipment every day (and who hear live music every day) when they say that, to them, the difference is significant.

One could argue that the problem here lies in the (perhaps analog) part of the A/D or D/A circuitry, and that the different sound may have nothing to do with the digital representation in principle. What I do remember is that some people complained that Stradivaris do not reproduce too well. But unfortunately this is all "anecdotal" evidence, and therefore not really scientific evidence.

Quote

But almost all of the faults I hear with CDs at home are almost certainly the fault of bad mastering.

If I could have access to some 24/96 music in wav format, I could set up one of those tests.

I would downsample the 24/96 wave to 16/44.1 using "realistic" filtering + resampling to 44.1 kHz in CEP (that should be equivalent to the filtering of a good ADC), and convert it to 16-bit with dither.
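That step can be sketched as follows. Here `resample_poly` merely stands in for CEP's resampler, and the TPDF dither is one generic choice, not necessarily what a given editor or ADC uses (numpy/scipy assumed; the helper name is ours):

```python
import numpy as np
from scipy.signal import resample_poly

def down_to_16_44(x96, seed=0):
    """Resample a float signal (range -1..1) from 96 kHz to 44.1 kHz,
    then quantise to 16 bits with TPDF dither."""
    # 44100 / 96000 = 147 / 320, so polyphase resampling is exact
    x44 = resample_poly(x96, 147, 320)
    # TPDF dither: sum of two uniform variables, 1 LSB peak each
    rng = np.random.default_rng(seed)
    lsb = 1.0 / 32768.0
    dither = (rng.uniform(-0.5, 0.5, x44.shape)
              + rng.uniform(-0.5, 0.5, x44.shape)) * lsb
    return np.clip(np.round((x44 + dither) * 32767),
                   -32768, 32767).astype(np.int16)
```

The polyphase route is attractive precisely because the 96 kHz to 44.1 kHz ratio is rational (147/320), so no interpolation error is introduced beyond the anti-alias filter itself.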

Then, I would play this wave with one of my cards, using its converters running at 44.1 kHz, and record the result with my other card at 48 kHz, 16-bit. Ideally it should be recorded at 24/96, but I only have one good non-resampling card at 44.1 kHz, which is used for playback, and it happens to be the same one that also supports 24/96. My other card is non-resampling only at 48 kHz, and is limited to 16/48, although I think it has pretty good quality. I think the fact that the recording is done at 16/48 instead of 24/96 wouldn't have much influence on the test, especially if no differences are found in the end. But this point should be analyzed in detail to see what problems it could cause, and improved if found necessary. At first glance, I think this 16/48 recording would capture all the "nasties" (aliasing, smearing, etc.) of the 16/44.1 DAC, but would introduce some of the 16/48 ADC's, although at higher frequencies.

Then, I would upsample this 16/48 recording to 24/96, and people with good 24/96 cards could perform some blind listening tests comparing the original vs. this other wave that has been downsampled, played, recorded and upsampled.

At the PCABX site there are some available clips to perform tests of this kind ( http://64.41.69.21/technical/sample_rates/index.htm ). They are interesting and show that, AFAIR, there is no difference in practice, but the clips available are not very representative of real-world conditions. They are very short in duration (and typical fans of high-res formats won't consider them very representative), and in fact no real-world DAC has been used in their generation, just software processing.

The best procedure for such a test seems to be to use one and the same output equipment (soundcard) with one and the same output sampling frequency (96 or 192 kHz, 20-24 bit), and to do everything else only in the digital domain.

Otherwise there will always be analog components that may interfere and that may distort the result of such a listening test. The output sampling rate must be the same for all tests (96 or 192 kHz). And the test signal/soundtrack would need to have been sampled with true high quality equipment at 96 or 192 kHz (20-24 bit).
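Under those constraints, the comparison collapses to a digital round-trip: take the 96 kHz source, resample down to 44.1 kHz and straight back up, and present both files at the same output rate through the same converter. A sketch of the idea (scipy assumed; a real test would of course use real programme material, not a test tone):

```python
import numpy as np
from scipy.signal import resample_poly

fs = 96000
t = np.arange(fs) / fs
x96 = 0.5 * np.sin(2 * np.pi * 1000 * t)   # stand-in for the 96 kHz source

# Round trip entirely in the digital domain: 96k -> 44.1k -> 96k
x44 = resample_poly(x96, 147, 320)
x_rt = resample_poly(x44, 320, 147)

# Both x96 and x_rt now play through the same DAC at 96 kHz, so any
# audible difference is attributable to the 44.1 kHz bottleneck alone
```

For in-band material like this 1 kHz tone, the round-trip reconstructs the original almost exactly; any audible difference in a real test would have to come from content or behaviour near and above the 22.05 kHz limit.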

The goal should be to prove that 44.1/16 is not enough for all signals, and that there is at least one signal (soundtrack) where the 44.1/16 resolution fails.

Quote

At the PCABX site there are some available clips to perform tests of this kind (...) and in fact no real-world DAC has been used in their generation, just software processing.

@2Bdecided -- You mentioned in your post that you discussed Demo2 with Halverson, and that he agreed with the impressions you had of a glassy sound with 44.1 and 48 kHz, and of more realism and depth with higher sampling rates.

Did he offer any possible explanations for the effect, gear-based or otherwise? It was his equipment, after all, and he must have given it some thought; maybe even played with different gear to see if the effect was consistently reproducible?

Nice post, but it proves nothing, sad to say... This is something all audiophiles (including me) suffer from. It's about the feelings and direct comparisons. Under normal listening conditions, on very average equipment (between 1500-2500 Euro for CD player, amp, speakers and cables together) and when you have nothing to compare with, you must definitely be deeply satisfied with a well-made 16/44.1 CD.

But I think, too, it's important to chase the highest quality possible, though a lot depends on the equipment you use... and the question is what importance this chase has in real life - and whether it has any importance for the Average Joe on the street. I'm afraid not. People are very, very satisfied with their clipped CDs and boomboxes. So it is just a matter for snobs (including me...).

And you're right in one thing - I am deeply convinced the best audio engineers were at work between 1954 and 1965 (approx.). They played with mikes, they could perfectly record the space, their recordings have depth... One just can't believe what they could do with relatively poor equipment. The recordings from that era are just unbelievable, marvellous, monstrous, incredible... I have a lot of them (many on vinyl, but they sound good from well-remastered CDs too) and they always bring me joy when I listen to them!

I find your report interesting, I hadn't read about it before. But, as you may imagine, I see some problems with it.

First, the program material is not what I would consider "critical". I doubt that a 1960s analog recording will have much content over 20 kHz, if any. And this applies too to the differences you also perceived with the 48 kHz sampling rate, which allows up to nearly 24 kHz to be captured. Still, I could be wrong.
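Whether the tape actually carries energy above 20 kHz is measurable rather than a matter of opinion: capture it at 96 kHz and look at the power spectrum. A hypothetical check using Welch's estimator (scipy assumed; the function name `ultrasonic_ratio` is ours):

```python
import numpy as np
from scipy.signal import welch

def ultrasonic_ratio(x, fs=96000):
    """Fraction of total power above 20 kHz in a capture sampled at fs."""
    f, pxx = welch(x, fs, nperseg=8192)
    return pxx[f > 20000].sum() / pxx.sum()

# A band-limited source scores near 0; broadband noise scores well above 0
```

If the ratio for the tape transfer is negligible, the "content over 20 kHz" explanation for any audible difference can be ruled out for that recording.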

Then, what you describe is interesting, but it is no more than anecdotal evidence. Who knows if you could have passed a double-blind, statistically sound listening test, even under the exact conditions in which you heard the differences.

Even if you passed it, it should be analyzed where the objective differences lie. We should analyze whether the differences are due to the particular setup you listened to, or just due to sampling rates. First, we could analyze whether there was some problem in the AD/DA chain. If not, we should analyze (measure) whether the differences could be due to some "problems" with the converters at the show (I'm being bad here), or due to the dCS converter implementation, or due to common converter implementations. For this, we could also try the high quality downsample/upsample process and compare.

I agree that this has no relevance in the "real world" of average Joes and boomboxes. But then, anything better than a reasonable 128kbps mp3 is irrelevant to 99% of people too.

If the audible differences are only present on equipment costing more than most people earn in a year, then maybe we can forget it. However, there's a 95%+ mark-up on audio equipment compared to the cost of the raw components. This means some talented audio amateurs could get close to this quality, given a lot of time, and maybe a month's wages to construct the equipment. So let's not talk about $20k audio systems as though they're unattainable. There are people who could get close enough with $1k, a soldering iron, and lots of patience. If this hobby is your passion, I think that may be worth it. As someone who might try it, I'd (selfishly) like some nice recordings to play on it, even if you lot can't hear the benefit!

If there's any truth in the theory that, price for price, a DAC can sound nicer decoding 96k than 44.1k, even though an infinitely expensive DAC can do both perfectly, then 96k seems a good idea.

However, I'm worried that it might work in quite a different way. What if it's like, say, mp3 artefacts? To start with, for some people, they're hard to hear. Many people can't hear them. But some people do learn to hear them. You start with the most extreme examples, and, once you've learnt the "sound" of mp3, it's much easier to identify - even in quite difficult examples. Once your ear/brain learns the pattern, you can find it much more easily. Look at how some of the people involved with codec testing and tweaking are among the most sensitive listeners - Garf, Dibrom etc. It's probably practice.

So, I'm suggesting that, when it's more commonplace to experience CD vs something else (better?), more people will get a chance to notice the difference. The more you get to compare, the more you learn how to recognise it - and once you've learnt well, the problems might annoy you. However, until something better becomes common, you have about as much chance of realising that CD isn't good enough as you would have of realising that 160kbps mp3 wasn't good enough in a world where there was NOTHING better. You have live music, but if the recording doesn't sound like live music - well - that's just your stereo.

I'm not saying I actually think either of these things is true. I'm suggesting them as possibilities and speculation - time will tell, and I think I agree with your assessment. Whatever - stereo > multi-channel is a much greater leap than 44.1k > 96k. If we can do both, great. If the multi-channel system is a version of ambisonics rather than 5.1-based, so much the better!

Quote

And you're right in one thing - I am deeply convinced the best audio engineers were at work between 1954 and 1965 (approx.). They played with mikes, they could perfectly record the space, their recordings have depth... One just can't believe what they could do with relatively poor equipment. The recordings from that era are just unbelievable, marvellous, monstrous, incredible... I have a lot of them (many on vinyl, but they sound good from well-remastered CDs too) and they always bring me joy when I listen to them!

Totally, totally agree! Some of the early stereo recordings are so good, it makes you think "how did they do that?!" - and why can't we do as well today?

You've reminded me: at the same AES, there was a workshop session where various producers talked about work that they'd done, and played examples. One had brought in the theme tune from Austin Powers - the original version (it's been remixed on subsequent films, and the track itself is around 40 years old), which had been recorded on a 3-track recorder. Brilliant! You could see people in the audience enjoying it. "How clever of you to anticipate the moves of Austin Powers 40 years ago!" joked the interviewer. Another producer brought some Bolton song from a Disney movie - he'd chosen to bring it because it was the first production he'd worked on where he needed more than 100 tracks. It sucked! Not just because it was Michael Bolton singing Disney - but the recording was just so artificial and flat and BORING! No one was in the least bit interested. (Maybe I'm exaggerating, but the poor comparison with the 1960s 3-track recording was quite embarrassing!)

Now, you could say that it was the songs themselves that made the difference - true, they certainly did. But there was a magic in the actual recording from the 1960s that was completely absent from the others. I'm sure it was the primitive technology, and the comparative simplicity of the mixing process, that made the recording distinctive, good, and enjoyable.

The best procedure for such a test seems to be to use one and the same output equipment (soundcard) with one and the same output sampling frequency (96 or 192 kHz, 20-24 bit), and to do everything else only in the digital domain.

That's how PCABX test clips have been generated.

Quote

Otherwise there will always be analog components that may interfere and that may distort the result of such a listening test.

My suggestion would include an actual DAC implementation in the test, but I guess that 24/96 recording would be very advisable.

@2Bdecided -- You mentioned in your post that you discussed Demo2 with Halverson, and that he agreed with the impressions you had of a glassy sound with 44.1 and 48 kHz, and of more realism and depth with higher sampling rates.

Did he offer any possible explanations for the effect, gear-based or otherwise? It was his equipment, after all, and he must have given it some thought; maybe even played with different gear to see if the effect was consistently reproducible?

He seemed to accept it as a fact of life at 44.1kHz. He'd obviously used other convertors, but that didn't seem the issue.

As for why - I can't remember who said it, but most of the ideas that were mentioned are now discussed on the dCS website, or in papers written by my old tutor (Malcolm Hawksford - search for his AES conference papers if you can). Energy dispersion. Non-linearities in the equipment and even in the air. The Japanese (?) paper showing the change of blood flow in the brain when ultrasonic sounds were present was shown first at that AES (without a translation!), and none of us could follow it! There was a general feeling that, well, maybe it's something to do with something we don't know about human hearing (it is so non-linear), but it's much more likely that it's some engineering issue which is explainable with real science if only you track down all the factors.

When you start to think about modelling everything in the signal chain carefully, it's quite conceivable that with any real-world equipment (no matter how expensive) there could be a just perceptible difference between 44.1kHz sampled material and 96kHz sampled material. What it's not possible to explain is why the latter sounds better.

I find your report interesting, I hadn't read about it before. But, as you may imagine, I see some problems with it.

Yep!

Quote

First, the program material is not what I would consider "critical". I doubt that a 1960s analog recording will have much content over 20 kHz, if any. And this applies too to the differences you also perceived with the 48 kHz sampling rate, which allows up to nearly 24 kHz to be captured. Still, I could be wrong.

It wasn't critical. Well, it depends what you're testing, but anyway - I agree. However, don't underestimate the frequency response of a good 2-track studio tape deck from the late 1960s.

Quote

Then, what you describe is interesting, but it is no more than anecdotal evidence. Who knows if you could have passed a double-blind, statistically sound listening test, even under the exact conditions in which you heard the differences.

I totally agree - it's just anecdotal. Given that I didn't hear it when I was concentrating, but did hear it when I wasn't, it would have to be a well-designed test. Certainly worth doing.
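For what it's worth, scoring such a test comes down to a one-sided binomial question: how likely is the observed number of correct ABX answers under pure guessing? A minimal scorer (the function name is ours):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: chance of getting at least `correct`
    of `trials` ABX answers right by guessing (p = 0.5 per trial)."""
    return sum(comb(trials, k)
               for k in range(correct, trials + 1)) / 2 ** trials

# 12 of 16 correct gives p ~ 0.038, below the usual 0.05 criterion;
# 10 of 16 gives p ~ 0.23 -- not good enough to claim a difference
```

Which is why two lucky hits, however suggestive, can't settle the question: the trial count has to be planned in advance to give the statistics any power.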

Quote

Even if you passed it, it should be analyzed where the objective differences lie. We should analyze whether the differences are due to the particular setup you listened to, or just due to sampling rates. First, we could analyze whether there was some problem in the AD/DA chain. If not, we should analyze (measure) whether the differences could be due to some "problems" with the converters at the show (I'm being bad here), or due to the dCS converter implementation, or due to common converter implementations. For this, we could also try the high quality downsample/upsample process and compare.

Whilst that is all interesting and vital for scientific research, it's irrelevant to knowing whether 96k is better or not. Let's assume I could pass a blind test. Let's assume some of the other people who claim to hear a difference (they're not all using dCS convertors!) also pass a blind test. "Passing the test" means detecting 44.1k vs analogue, but failing to detect 96k vs analogue. This actually clears up most of your points.

Whilst, as a scientist, I want to know why... as a listener, I just want to listen to the better system.

As a scientist, "the high quality downsample/upsample process" is the most interesting one to try.

As for why - I can't remember who said it, but most of the ideas that were mentioned are now discussed on the dCS website, or in papers written by my old tutor (Malcolm Hawksford - search for his AES conference papers if you can). Energy dispersion. Non-linearities in the equipment and even in the air. The Japanese (?) paper showing the change of blood flow in the brain when ultrasonic sounds were present was shown first at that AES (without a translation!), and none of us could follow it! There was a general feeling that, well, maybe it's something to do with something we don't know about human hearing (it is so non-linear), but it's much more likely that it's some engineering issue which is explainable with real science if only you track down all the factors.

The Japanese paper by Oohashi is the only thing I know of that offers some serious evidence of ultrasonic frequencies being perceived in some way. Still, there are also some issues with it:

For one, the test signal had extremely rich ultrasonic content, and the whole system used (including mic and speakers) was flat up to 50 kHz, IIRC. All of this is quite uncommon both in real-world recording studios and in real-world listening setups.

Also, I think the paper lacks some details about the experiment, such as detailed measurements of the signals at the listening location, intermodulation figures for the test chain, and the number of trials, that would add more solidity to the results.

Then, this paper was peer-reviewed, approved and published in a neuropsychiatry journal, but a very similar one from the same authors was presented several years earlier to the JAES and, AFAIK, never passed the peer-review stage. Even though the paper has been published in a respected medical journal, I'm waiting for anyone who can duplicate the results independently; nobody has done so far, AFAIK.

Now, I'll try to take a look at the dCS website and the papers available there. I think there is a possibility that nonlinearities in the equipment/air/ear could account for a true perceivable difference. Then, taking into account all possibilities, the issue would be which perception is more accurate to the real-world experience: the one caused by the signal lowpassed around 20 kHz, or the one that is NOT lowpassed? There would be some further discussion here.

Whilst that is all interesting and vital for scientific research, it's irrelevant to knowing whether 96k is better or not. Let's assume I could pass a blind test. Let's assume some of the other people who claim to hear a difference (they're not all using dCS convertors!) also pass a blind test. "Passing the test" means detecting 44.1k vs analogue, but failing to detect 96k vs analogue. This actually clears up most of your points.

Yes, this is what I was implicitly trying to say too: one should know whether the audible differences were due to the dCS converters at the show, due to "standard" dCS converters, or common to all or most converters available. In that last case, the use of 96 kHz would be an easy solution to a common problem. There would remain the question of whether, with better 44.1 kHz converters, the difference would persist, and if not, how good the converters would need to be.

Then, and taking into account all possibilities, the issue would be which perception is more accurate to the real-world experience: the one caused by the signal lowpassed around 20 kHz, or the one that is NOT lowpassed?

That's the really strange thing - with all these "non-linear" explanations, it should be better to remove the ultrasonic content and give the equipment an easier time; whereas the opposite appears to be true in practice.

FWIW the effects of huge amounts of ultrasonic noise on equipment are well known. It's often described as an "oily" sound. My guess is that the high frequencies are linearising the equipment in a similar manner to dither linearising a DAC. e.g. consider a graph of an audio amplifier's response: voltage in vs voltage out. Ideally the output voltage is always a constant multiplied by the input voltage. It should be a straight y=kx graph, up to the point where the thing distorts at high voltage, where for any increase in the input, the output stays stuck at some value.

Now, consider a non-ideal amplifier, where the input/output characteristic isn't quite a straight line. Maybe there's a small kink in the graph around y=x=0 (i.e. the middle, the origin). In a normal system, very quiet signals are always going to hit this kink, because it happens around very small voltages. However, add a huge amount of ultrasonic noise, and any quiet audible signal can be pushed onto any part of the input/output curve, and to different parts moment by moment. Before, the kink added a predictable error to the input (we call this distortion) - now, it's having an almost random effect. Adding something random is basically adding noise. Because the process is randomised by ultrasonic noise, the additional noise added by the "kink" will also be mainly ultrasonic. It's like magic - with enough ultrasonic noise, you can make the system perfect... If the system can handle the ultrasonic noise, that is!
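The mechanism described above can be sketched in a few lines of Python. All the numbers here (the size of the dead-zone "kink", the tone level, the dither amplitude) are made up for illustration: without the random bias, the error introduced by the kink tracks the signal, i.e. distortion; with a large random "ultrasonic" bias added, the same kink produces an error that is almost uncorrelated with the signal, i.e. noise.

```python
import math, random

def kink(x, dead=0.01):
    # crossover "kink": a small dead zone around zero
    if x > dead:
        return x - dead
    if x < -dead:
        return x + dead
    return 0.0

N = 4096
sig = [0.05 * math.sin(2 * math.pi * 7 * n / N) for n in range(N)]  # quiet tone

# error without dither: strongly correlated with the signal (= distortion)
err_plain = [kink(s) - s for s in sig]

# add a large zero-mean random "ultrasonic" bias before the kink
random.seed(0)
dither = [random.uniform(-0.5, 0.5) for _ in range(N)]
err_dith = [kink(s + d) - (s + d) for s, d in zip(sig, dither)]

def corr(a, b):
    # normalised correlation between two sequences
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b)) or 1.0
    return num / den

print(abs(corr(err_plain, sig)) > 0.5)   # True: error tracks the signal
print(abs(corr(err_dith, sig)) < 0.25)   # True: error is randomised into noise
```

The error energy hasn't gone away; it has just been decorrelated from the signal, which is exactly what dither does to quantisation error in a DAC.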

This is only my theory as to why ultrasonic noise makes audio equipment sound nice and "oily". Whatever the case, much of the "nice" sound of SACD is, supposedly, quite predictable from this oily sound, which you can get by adding huge amounts of random ultrasonic energy to any source.

With 24-bit there's enough thermal noise in the electronics that it's surely effectively dithered heavily in the ADC whether you want it to be or not! And remember, this was coming from a studio-quality analogue source medium before being digitised.
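That dithering effect is easy to demonstrate for a quantiser. In this sketch (made-up levels, working in units of one LSB) a tone with a peak of 0.4 LSB simply rounds to silence, but with TPDF dither the quantised output, averaged over many passes, recovers the sub-LSB tone - it survives as signal plus noise rather than vanishing.

```python
import math, random

random.seed(1)
N = 64
sig = [0.4 * math.sin(2 * math.pi * n / N) for n in range(N)]  # peak 0.4 LSB

# without dither, the quantiser truncates the whole tone to silence
undithered = [round(s) for s in sig]
print(max(abs(v) for v in undithered))   # 0: the signal has vanished

# TPDF dither (sum of two uniform +/-0.5 LSB draws); the mean quantised
# output over many passes tracks the sub-LSB input, at the cost of noise
passes = 10000
avg = [0.0] * N
for _ in range(passes):
    for n, s in enumerate(sig):
        d = random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)
        avg[n] += round(s + d) / passes
worst = max(abs(a - s) for a, s in zip(avg, sig))
print(worst < 0.05)   # True: the averaged output follows the tone
```

The analogy to the thermal-noise point: if the analogue noise floor already exceeds the LSB of a 24-bit converter, the conversion is self-dithering, and sub-LSB detail is preserved in the same statistical sense.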

Now, consider a non ideal amplifier, where the input/output characteristic isn't quite a straight line. Maybe there's a small kink in the graph around y=x=0. (i.e. the middle, the origin). In a normal system, very quiet signals are always going to hit this kink, because it happens around very small voltages. However, add a huge amount of ultrasonic noise, and any quiet audible signal can be pushed onto any part of the input/output curve, and to different parts moment by moment.

There is naturally a kink around x=y=0. Most amps have a DC bias so this doesn't happen at zero input signal, but at some point on the curve where the signal is loud enough to cover up the kink. A "class A" amp is defined as biased so that you never hit that zero crossing.

Your idea of ultrasonic bias is pretty much exactly what the bias on a tape recorder does, though the effect it is quashing is, I think, hysteresis.

Totally, totally agree! Some of the early stereo recordings are so good, it makes you think "how did they do that?!" and wonder why we can't do as well today!

Now, you could say that it was the songs themselves that made a difference - true, they certainly did. But there was a magic in the actual recording from the 1960s that was completely absent from the others. I'm sure it was the primitive technology, and the comparative simplicity of the mixing process, that made the recording distinctive, good, and enjoyable.

I think a partial explanation could be: because we have much better technology, we lose interest in how to really get the best possible sound... we rely too much on machines and effects and lose our own creative potential, because it sounds great almost without any effort... And remember, back then there was almost no post-production. The songs were recorded in complete takes, which is why on editions released nowadays you can see various take numbers. And the albums were mostly made in one or two days!

I know, but those are not what I'd call very detailed measurements. Also, if you compare the FRS and HCS spectra carefully (I did it by cutting them out of a paper printout and superimposing them against a window at home), you'll see that they are not identical; there are minor but obvious differences in some parts of the spectrum.

For that reason, a more complete set of measurements and an objective analysis of the process would have been necessary, in order to rule out possible real audible differences within the audible band itself.