44.1 vs. 88.2 revisited, but with a twist

I am going to do a simple test recording of direct acoustic guitar, at 44.1 and 88.2KHz rates, to compare for my own knowledge. My question is: if there are differences to be heard, what exactly should I be listening for?

I don't think you should have anyone tell you what to listen for because that will just cause bias.
You need to be very careful what source you use and make sure everything is exactly the same except for the sample rate.
The whole idea is to have an unbiased environment, so it needs to be a blind test, preferably with multiple listenings of the two tracks so it's not just a 50/50 chance test. It would be best to have someone else do the actual recordings or set up the playback scheme. I would be totally surprised if you are not already biased toward the 88.2 rate as being better, because there are so many claims (supposedly!) already out there to that effect. So there's also that!
If you are going to record an acoustic guitar, you will have to record it once onto something like a master source recorder (analog tape, perhaps) and then use that single source as the test sample to create the two digital recordings at the different sample rates. If you don't do that, you will be listening to two different performances with subtle differences, especially if you move at all in front of the microphone; you will never be able to duplicate the same sound live exactly. Good luck, and post your blind samples.
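To put a number on why multiple listenings matter, here is a minimal sketch that computes how likely a given score is by pure guessing. It assumes each trial is an independent coin flip (p = 0.5); the function name and the example scores are illustrative, not taken from any particular ABX tool.

```python
from math import comb

def chance_probability(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing alone."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Illustrative scores only: one trial tells you nothing, a long run can.
for correct, trials in [(1, 1), (8, 10), (12, 16)]:
    p = chance_probability(correct, trials)
    print(f"{correct}/{trials} correct: probability by chance = {p:.3f}")
```

A single correct pick is a coin toss, while 12 out of 16 would be unlikely to happen by luck, which is the practical argument for repeating the comparison many times.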

I've done a version of this type of test, but using a single microphone and single pre-amp per channel. Some tracks were a set of independent microphones and some were stereo microphones. The output of each pre-amp was split to feed two separate Alesis HD24XRs running independently, one at 44.1KHz and one at 96KHz. That was the way I could ensure that the same signal at the same amplitudes fed the two different recordings. I did several takes with different microphones and source material, alternating which HD24XR was running at 44.1KHz and which at 96KHz to average out any slight differences between the recorders. I began each take with the input switched briefly to a 1KHz sinewave tone for phase alignment purposes. I also took the precaution of using an external clock source for whichever HD24XR was running at 44.1KHz, as the Alesis internal clock, although stable, is not exact at that setting.

When listening to the results, it was usually easy to pick out the 96KHz recordings, even when the microphone in use for that test was a ribbon whose response started to tail off steeply at 15KHz. The extra "air" in the 96KHz recordings gave a sound much more like listening to a live studio feed via analog equipment. I've posted on this before, but my firm belief is that the significant difference is the setting of the digital anti-aliasing filters in the A-D converters. Despite the linear-phase nature of these filters, the effect on transients in the sound is heard more distinctly than a simple frequency-response figure would have us believe.

As a trial, I played a few of the 44.1KHz tracks as an analog signal and re-recorded them at 96KHz on the other recorder. I then transferred both the original 96KHz recordings of that take and the re-recorded ones into a DAW and tried subtracting one track from the other in an attempt to see the differences. Despite using the test tone to make adjustments to the time alignment and amplitude of the tracks before subtraction, it was difficult to get results from the live tracks that when scaled up would represent just the high frequency differences. Although this was not an exhaustive test, it would reinforce the belief that there are several subtle effects due to the difference between digital 44.1 vs. 96 KHz recordings, and they are not restricted to the audio above 20KHz.
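A minimal sketch of the subtraction ("null") test described above, using synthetic 1KHz tones rather than real recordings. The least-squares gain match and the function names are assumptions made for illustration, not the procedure actually used in the DAW.

```python
import math

def best_gain(ref, other):
    """Least-squares gain g that minimises the energy of ref - g * other."""
    num = sum(r * o for r, o in zip(ref, other))
    den = sum(o * o for o in other)
    return num / den

def rms_db(x):
    """RMS level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def null_depth_db(ref, other):
    """Level of the residual after gain-matched subtraction, relative to ref."""
    g = best_gain(ref, other)
    residual = [r - g * o for r, o in zip(ref, other)]
    return rms_db(residual) - rms_db(ref)

fs = 96_000
n = 9_600  # exactly 100 cycles of a 1 kHz tone
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(n)]
quieter = [0.8 * v for v in tone]  # same signal, level mismatch only
late = [math.sin(2 * math.pi * 1000 * (i - 1) / fs) for i in range(n)]  # one sample late

print(f"level mismatch only: {null_depth_db(tone, quieter):.1f} dB")
print(f"one sample late:     {null_depth_db(tone, late):.1f} dB")
```

A pure level mismatch nulls almost completely once the gain is matched, but even a one-sample timing error leaves a residual only a few tens of dB down, which gives some idea of why clean difference signals are hard to obtain from live tracks.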

Note that these tests were all done at the 24-bit level, and could represent effects that are lost when transferred to 16-bit CDs. However, I have mentioned before that some "direct-to-disk" recordings indicate that there can be other digital processing effects in conventional studio recording and mixing that are audible even after transfer to 44.1KHz/16-bit CDs.

The difference in sound could be attributed to the Alesis failing to synchronize with the external clock properly.

There was an interesting SOS article recently, which tested various bits of gear with internal and external clocks: their conclusion was that none of the equipment performed better when clocked externally: with the high end gear external clocking made no difference, while budget gear showed worse performance with the external clock than with their own internal clock.

As you can see, these tests are not trivial to design. I recorded analog outputs of digital samples at both sample rates. Not as good as Boswell's test, but the best I could manage at the time. Do the best you can, but recognize that it will be hard to eliminate bias. In addition to listening to the recordings at 44.1 and 88.2, I'd encourage you to dither them down to whatever you consider the final listening environment (16/44.1 for me). This will allow you to easily scramble the mixes and order them randomly. I never figured out an easy enough way to do a reasonably blind comparison of the raw takes. Still, I felt I could hear better high frequency and transient response in the 88.2 takes. The difference was much less in the dithered output. I found myself listening for artifacts rather than overall quality to decide which was which.
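For anyone unsure what "dithering down" involves, here is a minimal sketch of reducing floating-point samples to 16-bit with TPDF (triangular) dither. TPDF is just one common choice and is an assumption here; the post does not say which dither was used, and real converters and DAWs may add noise shaping on top.

```python
import math
import random

def dither_to_16bit(samples, rng):
    """Quantise floats in [-1.0, 1.0] to 16-bit integers using TPDF dither.

    The dither is the difference of two uniform values, giving a triangular
    distribution spanning +/-1 LSB (an assumed, commonly used choice).
    """
    out = []
    for x in samples:
        scaled = x * 32768.0
        tpdf = rng.random() - rng.random()
        q = int(math.floor(scaled + tpdf + 0.5))  # round to nearest after dither
        out.append(max(-32768, min(32767, q)))    # clamp to the 16-bit range
    return out

rng = random.Random(0)
# A constant signal a quarter of one 16-bit step above zero: plain rounding
# would turn every sample into 0, while dither keeps the level on average.
quiet = [0.25 / 32768.0] * 20_000
dithered = dither_to_16bit(quiet, rng)
print("average of dithered output in LSBs:", sum(dithered) / len(dithered))
```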

In the end, I've decided to record songs with high track counts at 44.1, low counts at 88.2. YMMV.

I think your test may be flawed: The difference in sound could be attributed to the Alesis failing to synchronize with the external clock properly.

There was an interesting SOS article recently, which tested various bits of gear with internal and external clocks: their conclusion was that none of the equipment performed better when clocked externally: with the high end gear external clocking made no difference, while budget gear showed worse performance with the external clock than with their own internal clock.

The need to clock HD24s from an external clock at 44.1/88.2 KHz has been well documented on this forum and others, as the HD24 internal clock is slightly inaccurate at these rates. The HD24s synchronize well to an external clock, and this particular effect has nothing to do with clock quality, stability or jitter; it's simply a matter of accuracy. I saw the SOS article, and I would put the HD24XR in your category of "high-end gear", where there is no difference in sonic quality through the use of a (good) external clock over the internal clock.

It was some years ago I did these tests, but I used an external clock for the 44.1KHz rates to avoid the effects I had previously had of pedants pointing out the inaccuracy of the internal clock and implying that sonic quality issues were due to that. In fact, the matter is unimportant if recording and replaying is entirely in analog or entirely in digital, but matters where one process is analog and the other digital.

So, now you have pedants pointing out that the clocking might account for the difference instead. :wink:

So, when you say inaccurate; you mean the clock is stable and jitter-free, but doesn't run at the stated speed? Was it slower or faster, and by how much?

Yes, that's right, although the concept of "jitter-free" is relative. I don't have my technical files with me at the moment, but my memory is that the 54 MHz internal crystal is divided down by 1125 to get 48.0000 KHz, or by 1224 to get 44.11765 KHz, i.e. about 0.04% fast at that lower rate. While it's necessary to divide down to these frequencies to meet the requirements of the ADAT lightpipe frame rates, the ADCs and DACs are actually clocked at over-sampled rates in the MHz range via a phase-locked loop multiplier, which is also used for multiplying up the external clock input when that is selected.
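The arithmetic above is easy to check. This sketch takes the crystal frequency and divider values exactly as quoted from memory in the post (so treat them as unverified), and adds two derived figures that are not in the post: the equivalent pitch offset in cents and the timing drift per hour.

```python
import math

CRYSTAL_HZ = 54_000_000  # figure quoted from memory in the post

def divided_rate(divider: int) -> float:
    """Sample rate obtained by dividing the crystal frequency by an integer."""
    return CRYSTAL_HZ / divider

rate_48 = divided_rate(1125)
rate_44 = divided_rate(1224)

ratio = rate_44 / 44_100
error_percent = (ratio - 1) * 100
offset_cents = 1200 * math.log2(ratio)   # pitch offset if replayed at a true 44.1 kHz
drift_s_per_hour = 3600 * (ratio - 1)    # timing drift against an exact 44.1 kHz clock

print(f"54 MHz / 1125 = {rate_48:.4f} Hz")
print(f"54 MHz / 1224 = {rate_44:.3f} Hz ({error_percent:.3f}% fast)")
print(f"pitch offset  = {offset_cents:.2f} cents")
print(f"timing drift  = {drift_s_per_hour:.2f} s per hour")
```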

I've done a version of this type of test, but using a single microphone and single pre-amp per channel...

...Note that these tests were all done at the 24-bit level, and could represent effects that are lost when transferred to 16-bit CDs. However, I have mentioned before that some "direct-to-disk" recordings indicate that there can be other digital processing effects in conventional studio recording and mixing that are audible even after transfer to 44.1KHz/16-bit CDs.

Interesting replies all around. A few things jump out at me. Here are two for the time being.

One is that it is beyond my scope to actually do this test (at this time) with the kind of precision that is being prescribed. The other is that the second paragraph of the above quote is of supreme importance, and I am confused why you would go to all of the trouble to A/B test sample rates, and not perform the next step to see if it carries over after dithering. If it does, then higher sample rates matter. If it does not, then using higher sample rates is superfluous. I do see that you have some indication from other procedures that lead you to conclude that some effects carry over, but why did you not go the whole nine yards with your more complex and rigorous testing that you described?

I've also run tests at high sample rates and had a similar experience to Boswell's.
Mine was an RME FF800 at 192, and there was an airy feeling to the higher-rate recording.
So there is definitely something to tracking at high rates. I don't know if that means you should use 88, 96 or even 192. My gut feeling is 192 is the place to be, but I think that all depends on what you're trying to accomplish.
I don't buy the HDD space issue anymore because systems have such good throughput now. There's nothing stopping anyone from using eSATA or SSD for recording drives and, if need be, moving the recorded track data afterwards to a storage medium for mixing, processing etc.
The end result is how much detail can be captured and whether this is necessary. In my mind there's no need to worry about this if it's rap, spoken word, hip hop or even processed pop (not trying to slam any genre here). Acoustic classical guitar or something like that, with serious dynamic range and articulation that might "sound" open or live or airy, may be worth the effort. But it seems to be part of the whole argument: what's the point if your content doesn't warrant it? Some of the stuff out there sounds fine with iPods and earbuds. What's more interesting is this clock argument being related to all this. Come on, we're talking about digital sync here: it's either on or off, sync or no sync, there is no other choice. Clock accuracy and jitter are measured in terms of picoseconds. A little drift here and a little drift there... really? Seriously?
I can see this turning into a real argument now!

...I am confused why you would go to all of the trouble to A/B test sample rates, and not perform the next step to see if it carries over after dithering. If it does, then higher sample rates matter. If it does not, then using higher sample rates is superfluous. I do see that you have some indication from other procedures that lead you to conclude that some effects carry over, but why did you not go the whole nine yards with your more complex and rigorous testing that you described?

I take your point, but that was not what I was investigating at the time. The HD24XR is inherently a 24-bit recorder, and I decided to do the sampling rate tests because I was in negotiation then with a client about whether a particular live event should be recorded at 44.1KHz or 96KHz. I was the only recording engineer bidding 96KHz for this job, and I wanted to show that it made a difference, other things being equal. If the 44.1KHz test recordings were deemed acceptable, he would go with that rate.

On listening to the tests, he got the point that (unsurprisingly) the 96KHz recordings sounded better, so that's what we used. The post mixing was done on an analog large-format studio console, with the 2-track mix digitized at 44.1KHz/24-bit. I didn't do the mixdown, but the result had the qualities associated with higher-rate digital recordings of good "air" and "presence". I understand the 96KHz/24-bit originals are archived for possible future release in a higher-rate medium.

The surprise may be that many recording and mixing engineers often do go to this sort of trouble when the designated distribution medium is a 44.1KHz/16-bit CD. Call us old-fashioned.

As I understand it there are basically two schools of thought concerning the benefits of recording at higher sample rates.

1. Frequencies above 20KHz are significant, even though we don't hear them directly, so preserving them using higher samplerates improves the sound audibly.

2. Frequencies above 20KHz are not significant, but the gentler anti-aliasing filters needed at higher SRs do less damage to the significant frequencies below 20KHz.

It seems to me that if version 1 is correct, then in Boswell's scenario above all the advantages of recording at higher sample rates would have been lost when printing the stereo mix at 44.1KHz.

However, if version 2 is correct we might indeed hear an improvement, as the final mix would have passed through steep anti-aliasing filters once instead of twice.

My own guess is version 2. But I can't recreate the test because I mix primarily in the box using plug-ins: these will all sound better at higher samplerates!

In other words, if you mix ITB it doesn't really matter whether version 1 or 2 is correct, as there should be a real improvement when working at higher samplerates either way, and this will definitely still be audible after downsampling to 44.1KHz (assuming a reasonably good quality resampling algorithm).
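To put a rough figure on the "gentler anti-aliasing filters" in version 2, this sketch compares how much room a filter has between the top of the audio band and the Nyquist frequency at each rate. The 20KHz passband edge is an assumption taken from the discussion; real converter filters are designed differently and may not use the whole gap.

```python
import math

def transition_band(fs_hz: float, passband_edge_hz: float = 20_000.0):
    """Return (width in Hz, width in octaves) between the passband edge and Nyquist."""
    nyquist = fs_hz / 2
    width_hz = nyquist - passband_edge_hz
    width_octaves = math.log2(nyquist / passband_edge_hz)
    return width_hz, width_octaves

for fs in (44_100, 88_200, 96_000):
    width_hz, width_octaves = transition_band(fs)
    print(f"{fs} Hz: {width_hz:.0f} Hz of transition band ({width_octaves:.2f} octaves)")
```

At 44.1KHz the filter has to go from passing to fully stopping in roughly a seventh of an octave, whereas at 88.2KHz it has more than a full octave to do the same job.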

That's probably a reasonable summary, but I would add the following points:

1. "Hearing tests" are what we use to say that we can't perceive tones above certain frequencies, e.g. 20KHz as a child, 15KHz as an adult. However, they are done using steady-state tones, and don't take into account the high-frequency components of many transients. For example, some older people who may have conventional hearing fall-off above 15 KHz can hear the difference between laboratory recordings of transient sounds that are the same except for frequencies in the 18-25KHz range being filtered out in one set. One explanation of this is that a falling-off in hearing at a particular frequency is not a brick-wall effect, and so there will still be an attenuated response at these higher frequencies. Another explanation is that there is a different mechanism in the ear that deals with transient sounds, and this does not fall away with age as fast as the tests from steady-state tone measurements would have us believe. Both of these and possibly other processes could be happening at once. I'm not an audiologist, but I do have some belief in the second explanation.

2. The sound on commercial CDs produced via conventional modern studio techniques is liable to have gone through a LOT of digital processing of various sorts. Each of the processes that the sound has to go through has the capability of adding a subtle flavour. I seem to be referring to this frequently these days, but the sound from a good "direct-to-disk" live CD can easily surpass that from a studio recording. It's a similar story with the 96KHz/24-bit recordings I mentioned earlier. Although the 44.1KHz/16-bit CD version from those tracks was audibly "less good" than the original masters, it was still a good CD, and arguably better in sound than if the original recordings had been at 44.1KHz and mixed digitally.

3. Note that no digital sampling-rate conversion was performed in the set of tests and recordings I mentioned. The 96KHz digital recordings were replayed then processed and mixed in analog before being re-sampled at 44.1KHz as though they were a pair of live stereo tracks. Digital sampling-rate conversion generates its own set of effects (another topic).

...In other words, if you mix ITB it doesn't really matter whether version 1 or 2 are correct as there should be a real improvement when working at higher samplerates either way, and this will definitely still be audible after downsampling to 44.1KHz (assuming a reasonably good quality resampling algo.)

Could you expand on this? I think I see the point, but is it clear that improvements in plug-in performance will survive downsampling? Do you have specific types of plug-ins that you expect to perform better at higher rates?

Update: I did not do any processing in my tests. If I were going to test this (the effect of different sample rates on plug-ins), which plug-ins would you expect to show the most noticeable effect?

The fact is, almost all digital processors introduce some degree of aliasing, where frequencies above Nyquist get 'folded down' into the audio range and cause a nasty type of brittle 'digital sounding' distortion. Once this has happened there is no way to remove it, so the only way to cure aliasing is to prevent it happening in the first place.

A plug-in that implements internal oversampling will first up-sample the incoming audio, then process at a higher samplerate (hopefully high enough to shift the aliasing up above 20KHz) before filtering at Nyquist and decimating back to the original rate.

If you run at a higher samplerate to begin with you are effectively oversampling the entire mix: plug-ins that don't implement internal oversampling will benefit from lower levels of aliasing in exactly the same way, while plug-ins that do use internal oversampling will now be downsampling to your higher rate, and will therefore need less draconian anti-alias filtering (or they might not need to oversample at all).
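As a small worked example of the fold-down, this sketch computes where the harmonics generated by distorting a 15KHz tone would end up at each session rate if the plug-in applied no oversampling or filtering at all. The 15KHz tone and the helper name are illustrative assumptions.

```python
def folded_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency at which a component at f_hz appears once sampled at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

fundamental = 15_000  # illustrative tone fed into a distortion process
for fs in (44_100, 88_200):
    for harmonic in (2, 3):
        generated = fundamental * harmonic
        landed = folded_frequency(generated, fs)
        print(f"fs={fs}: harmonic {harmonic} at {generated} Hz appears at {landed:.0f} Hz")
```

At 44.1KHz the third harmonic (45KHz) folds all the way down to 900 Hz, well inside the audio band and unrelated to the original note. At 88.2KHz the same harmonic lands at 43.2KHz, where it is inaudible and can be filtered off before the final downsample.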

The more aliasing the process generates the more it is likely to benefit. Distortion is the most obvious candidate, as this adds harmonics by definition. Amplitude modulation or ring modulation will benefit greatly. Compression or limiting can also cause aliasing: the faster the attack or release times the more likely this is to be audible (compression is just a form of amplitude modulation after all). Also EQ high frequency boosts.

However, I think the easiest way to hear the difference is with a virtual analog type synth, preferably an old and naive model.* Set up a single raw sawtooth wave with no filtering, and play up high: the difference should be pretty startling at higher samplerates.

* The FabFilter synths (for example) have scarily low aliasing levels even at 44.1KHz, so might not be the best choice for this test! Maybe try the synths bundled with your DAW, or download some freebies?
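The sawtooth test can be estimated on paper too. This sketch assumes an idealised naive sawtooth whose harmonic k has amplitude 1/k relative to the fundamental, with no band-limiting at all, and lists which harmonics fold back below 20KHz. The 3KHz note and the function names are assumptions for illustration; a real synth will behave differently.

```python
import math

def folded_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency at which a component at f_hz appears once sampled at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

def audible_aliases(f0_hz, fs_hz, audible_limit_hz=20_000.0, max_harmonic=200):
    """List (harmonic, folded frequency, level in dB re fundamental) for aliases below the limit."""
    aliases = []
    for k in range(1, max_harmonic + 1):
        f = k * f0_hz
        if f <= fs_hz / 2:
            continue  # below Nyquist: a genuine harmonic, not an alias
        folded = folded_frequency(f, fs_hz)
        if folded < audible_limit_hz:
            aliases.append((k, folded, 20 * math.log10(1 / k)))
    return aliases

for fs in (44_100, 88_200):
    k, freq, level = max(audible_aliases(3_000, fs), key=lambda a: a[2])
    print(f"fs={fs}: strongest audible alias is harmonic {k} at {freq:.0f} Hz, {level:.1f} dB")
```

Under these assumptions the loudest in-band alias of a 3KHz sawtooth comes from harmonic 9 at 44.1KHz but only from harmonic 23 at 88.2KHz, so it starts out several dB lower before any filtering is applied.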

Great. That makes a lot of sense. Another point I had not thought about is that it makes sense to audition plug-ins at multiple sample rates to see if you can perceive artifacts by comparison. I rarely use digital synths or distortion devices, and I definitely know what aliasing of a sawtooth sounds like (that's the textbook example, isn't it?). But I definitely want to try this with my various compressor plugs to see if I can hear any differences.