16 July, 2010, 12:07:42 PM

P18-6 Sampling Rate Discrimination: 44.1 kHz vs. 88.2 kHz - Amandine Pras, Catherine Guastavino, McGill University, Montreal, Quebec, Canada

It is currently common practice for sound engineers to record digital music using high-resolution formats, and then downsample the files to 44.1 kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1 kHz and 88.2 kHz with the same analog chain and type of A/D converter. Sixteen expert listeners were asked to compare 3 versions (44.1 kHz, 88.2 kHz, and the 88.2 kHz version down-sampled to 44.1 kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz. Convention Paper 8101

Interesting. The fact that the discrimination was mostly limited to the downsampled version would seem to indicate that any audibility issues are with the downsampling procedure rather than the sampling rate itself. There is, however, the case of the orchestral music...

Searching for the paper I also found that the same authors have published a test comparing CD-quality vs MP3:

Quote

Subjective Evaluation of MP3 Compression for Different Musical Genres

MP3 compression is commonly used to reduce the size of digital music files but introduces a number of potentially audible artifacts, especially at low bitrates. We investigated whether listeners prefer CD quality to MP3 files at various bitrates (96 kb/s to 320 kb/s), and whether this preference is affected by musical genre. Thirteen trained listeners completed an A/B comparison task judging CD quality and compressed files. Listeners significantly preferred CD quality to MP3 files up to 192 kb/s for all musical genres. In addition, we observed a significant effect of expertise (sound engineers vs. musicians) and musical genre (electric vs. acoustic music).

Again, frustratingly little information given about the details of the test...

P18-6 Sampling Rate Discrimination: 44.1 kHz vs. 88.2 kHz - Amandine Pras, Catherine Guastavino, McGill University, Montreal, Quebec, Canada

It is currently common practice for sound engineers to record digital music using high-resolution formats, and then downsample the files to 44.1 kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1 kHz and 88.2 kHz with the same analog chain and type of A/D converter. Sixteen expert listeners were asked to compare 3 versions (44.1 kHz, 88.2 kHz, and the 88.2 kHz version down-sampled to 44.1 kHz) of 5 musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz. Convention Paper 8101

Was anyone at the presentation? Has anyone bought the paper?

Some friends from here went to the convention and we discussed a number of the papers last Saturday.

The problem I see with the test is that it was done rather awkwardly, with the DACs running at different sample rates. Recording with parallel ADCs running at different sample rates introduces further potential for artifacts.

It is easy to show that recording and playing at different sample rates sounds different if the equipment has artifacts that are associated with different sample rates, which is not uncommon.

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.
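As a rough illustration (my own sketch, not from the paper), here is what that down-then-up round trip looks like with scipy's polyphase resampler; the choice of `resample_poly` and the 10 kHz test tone are mine:

```python
import numpy as np
from scipy.signal import resample_poly

fs_high = 88_200                      # master recording rate
t = np.arange(fs_high) / fs_high      # 1 second of samples

# Band-limited test tone well inside the 44.1 kHz passband (10 kHz).
x = np.sin(2 * np.pi * 10_000 * t)

# "Low-rate" version: 88.2k -> 44.1k -> back up to 88.2k, so that both
# versions can then be played through the same hardware at the same rate.
y = resample_poly(resample_poly(x, 1, 2), 2, 1)

# Trim filter edge effects before comparing.
err = np.max(np.abs(x[2000:-2000] - y[2000:-2000]))
print(f"max round-trip error for in-band content: {err:.2e}")
```

For in-band material the round trip is essentially transparent; only content above the lower Nyquist frequency is removed, which is exactly the difference the experiment is supposed to isolate.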

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

No. Upsampling shouldn't have any impact on the sound, but would avoid any potential issues with the playback hardware.

Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled version. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz.

So this basically means it's more important to make the DAC high-resolution than to make the ADC high-resolution. Interesting. So pbelkner's question is legitimate. Maybe an 88.2 kHz DAC tends to reproduce a 44.1 kHz recording more faithfully than a 44.1 kHz DAC? Assuming a good upsampling algorithm, of course.

I guess I'll post a few of the details and observations here so that people don't need to go to the PDF to check them:

- Trained listeners were 9 sound engineers and 4 musicians, mean age ~28 (SD 5.6 yr)
- LAME used, unknown version. Bitrates: 96, 128, 192, 256, and 320 kb/s. Alas, no VBR it seems. Pity.
- Genres: Pop, Metal/Rock, "Contemporary", Classical, Opera. <10 sec excerpts.
- HQ speaker setup (not headphones). Wonder if it would have made much difference?
- 150 randomized trials (per excerpt? overall? not clear). Pairwise A/B. Testers were asked to "prefer" one or the other, and the overall % was tested for statistical significance.
- For 256 and 320 kb/s, preference was 50/50 (so not significant). For 192, 128, and 96, it was 60/40, 75/25, and 80/20, respectively (significant).
- Sound engineers were more likely to prefer the higher-quality version than musicians. Electric genres (pop/metal) were more frequently preferred in their HQ version than acoustic ones.
- Problems cited, in decreasing order of frequency: high-frequency artifacts, general distortion, transient artifacts, stereo image, dynamic range, reverb, background noise.
- No correlation between listening habits and performance.
- The conclusion states that trained listeners cannot discriminate between CD quality and MP3 compression at 256-320 kb/s, while expert listeners could. Not sure who these "expert listeners" are supposed to be, but probably the test subjects from a referenced paper (Sutherland 2007?) who reportedly could do so even at 320.
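Since the per-condition trial count isn't clear from the paper, one can only sketch how those preference splits map onto significance. A back-of-the-envelope check (my own, using scipy's exact binomial test, with assumed trial counts N):

```python
from scipy.stats import binomtest

# Hypothetical per-condition trial counts; the paper's actual N is unclear.
for n in (30, 60, 150):
    for frac in (0.6, 0.75, 0.8):
        k = round(n * frac)
        p = binomtest(k, n, 0.5).pvalue      # exact two-sided test
        print(f"N={n:3d}, {frac:.0%} preference -> p = {p:.3f}")
```

The point being that a 60/40 split is only significant with a fairly large N, while 80/20 reaches significance quickly, which is consistent with the reported pattern.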

I suppose that the results aren't very surprising, really. If VBR had been used, I imagine the threshold bitrate would likely have ended up around that obtained by V2, supporting its status as the "best value" setting.

Yes, exactly. It is cheaper to design a 192 kHz DAC than a 44.1 kHz DAC because the reconstruction filter can be of a lower order. If you compared two theoretically perfect DACs, 192 vs. 44.1, perfect upsampling would make no audible difference. But at the consumer end of the market, a 192 kHz DAC is likely to have better audio performance characteristics than a 44.1 kHz DAC. How much impact this has in practice is fairly small, and it will only decrease as technology becomes cheaper.

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

No. Upsampling shouldn't have any impact on the sound, but would avoid any potential issues with the playback hardware.

So this basically means it's more important to make the DAC high-resolution than to make the ADC high-resolution. Interesting. So pbelkner's question is legitimate. Maybe an 88.2 kHz DAC tends to reproduce a 44.1 kHz recording more faithfully than a 44.1 kHz DAC? Assuming a good upsampling algorithm, of course.

My preference is to up-sample 44.1 kHz to 88.2 kHz and 48 kHz to 96 kHz, respectively, depending on the original sample rate, i.e. no fixed rate for up-sampling but an integer multiple of the input sample rate.

1) I suggest this thread be focused on the 44.1 vs 88.2 paper. A separate thread can be about the MP3 study.

2) I bought the paper. Here's a paraphrase of the methods and results. Note that the test signals were recorded by the authors.

subjects: 15 male, 1 female, all having at least 3 years of sound engineering experience, six being pros, ten being students. All but one were musically trained.

equipment: the recording microphones (a pair of Sennheiser MKH 8020) had a FR of 10 Hz-60 kHz. Two stereo feeds ran from the mic preamp (Millennia HV-3D) to two Micstasy ADCs, one set to 44.1/24 and the other to 88.2/24; the 44.1/24 digital signal was then recorded (at 44.1) on a Sound Devices 744T portable recorder, while the 88.2 output was recorded on a MacBook Pro at 88.2 using Logic Studio software. The recording diagram also shows that the 44.1 ADC used its internal clock, while the 88.2 ADC's master clock was a Mutec.

test signals: five musical/instrumental (orchestra, classical guitar, cymbals, voice, violin) recordings by the authors, from live performances at McGill that took place in several halls/rooms with varying dimensions and acoustics. For use in tests, 5-8 sec excerpts were used, with no processing except fade in/out via Pyramix 6 software. Care was taken to make the fades the same on the 44.1 and 88.2 examples. The 88.2 excerpts were also then downsampled to 44.1 via Pyramix software, so there were three sets of signals: native 44.1, native 88.2, and downsampled 88.2-->44.1.

playback: 5 blocks corresponding to the 5 excerpts, 12 trials per block (i.e., all pairwise combinations of the three versions, each presented 4 times, twice in each of the two presentation orders). A randomized ABX protocol was used. Listening occurred in an ITU-standard room (the Critical Listening Lab of the CIRMMT, Montreal). Playback hardware was an RME Fireface 800 DAC, a Grace m906 monitor controller, and a Classe CA-5200 stereo amp, feeding a pair of B&W 802D loudspeakers (FR 70 Hz-33 kHz). The authors picked the Fireface because "it was the only converter that allowed us to switch sample rates between 44.1 and 88.2 in a respectable amount of time." To avoid clipping, a 750 ms switching interval was employed, set in the user interface, which was Cycling '74's Max/MSP/Jitter software package. (I'm not quite clear from this how ABX switching was done, though I'm guessing the UI allows it?) All playback was at 24 bits.

Results: Considering cumulative binomial test results (i.e., for all comparisons of all excerpts), 3/16 individuals achieved significant results (p < 0.05, 2-tailed), but 'they significantly selected the wrong answer' (?!). The other 13 didn't perform better than chance, either individually or as a group. But for these 13 subjects as a group, IF one considers each format comparison separately (rather than combining all comparisons), significant results were observed for 88.2 vs. downsampled (p = .04, 2-tailed). For the same group, a 'tendency' was observed for 88.2 vs. native 44.1 (p = 0.1). There was no significant difference between native 44.1 and downsampled 44.1 (p = .2; I guess this doesn't constitute a 'tendency'). Results for the 13 subjects (grouped) were also re-analyzed by musical type. Significant: the orchestral excerpt for 88.2 vs. native 44.1 (p = .02), and the classical guitar and voice excerpts for 88.2 vs. downsampled (p = .004 and p = .04). Not significant for any excerpt: native 44.1 vs. downsampled 44.1.
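For context (my own back-of-the-envelope, not from the paper): with 60 trials per subject (5 blocks x 12), the exact binomial criterion for individual significance works out as follows, assuming scipy's `binomtest`:

```python
from scipy.stats import binomtest

N = 60  # 5 blocks x 12 trials per subject, as described above

# Smallest number of correct answers out of 60 that reaches two-tailed
# significance at alpha = 0.05 against guessing (p = 0.5).
for k in range(N // 2, N + 1):
    p = binomtest(k, N, 0.5).pvalue
    if p < 0.05:
        print(f"criterion: at least {k}/{N} correct (p = {p:.3f})")
        break
```

The two-tailed test is also why "significantly selecting the wrong answer" is possible: a score far enough *below* chance is flagged just as a score far enough above it would be.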

I also don't understand why a 192 kHz DAC is supposedly cheaper to build. It is cheaper to build a good-sounding 96 kHz ADC than a 44.1 kHz one, since the latter needs brickwall filtering. But oversampling DACs have quite relaxed filtering requirements already. So why should a 192 kHz version be cheaper to build than a 44.1 kHz version?

PS

Thank you, krabapple! The data shows not proof but merely a 'tendency', which makes the summary look quite fishy... Calling p = 0.1 a 'tendency' is quite a stretch, and saying so is hardly a baseless accusation, I guess.

I think the least incorrect way to compare 88.2 kHz and 44.1 kHz would be to record at a higher sample rate, e.g. 176.4 kHz, downsample to 88.2 kHz and 44.1 kHz, and then upsample both versions to yet another sample rate before the listening test, for instance 192 kHz. This would apply two sample-rate conversions to each test sample.

The *right* way to do this experiment is to record everything at a high sample rate, and create the low sample rate data by down sampling followed by up sampling. Then, all the hardware runs at the same (high) sample rate and differences due to its failings are minimized.

Does this mean up-sampling CDs from 44.1 kHz to 88.2 kHz improves their sound, i.e. brings them closer to the sound one would achieve recording them at 88.2 kHz?

No. Upsampling shouldn't have any impact on the sound, but would avoid any potential issues with the playback hardware.

Agreed. Some audiophiles will start a controversy at this point because they believe that upsampling improves sound quality, even though the music actually remains the same and is merely represented by more samples. Much equipment has been sold on the basis of this belief. In fact, if you do a proper job of upsampling, nothing changes in the end when you convert it back to audio.

There have been many improper jobs of resampling that have led to misleading audible differences. However, we have long (over a decade) had good resampling software such as Cool Edit Pro and Audition.

The types of issues I'm specifically trying to avoid are situations where there is some quirk in the playback hardware that is audible at one sample rate, but not the other.

If you are doing an ABX test that involves comparing two files at two different sample rates, switching between them can produce (and has produced) transients that are peculiar to one sample rate or the other. For example, if you are switching from A to X and X is B, with a different sample rate than A, the transition may sound different than switching from A to X where X is A and the sample rate therefore stays the same.

Another example might be where the audio interface has slightly different response into a long cable at different sample rates. These sorts of things aren't intentional, don't show up on spec sheets, and may not show up in typical use, but once you set up an ABX testing environment, there you have it!

I think the least incorrect way to compare 88.2 kHz and 44.1 kHz would be to record at a higher sample rate, e.g. 176.4 kHz, downsample to 88.2 kHz and 44.1 kHz, and then upsample both versions to yet another sample rate before the listening test, for instance 192 kHz. This would apply two sample-rate conversions to each test sample.

If that floats your boat, then it's not worth arguing about.

However, good resampling, particularly with 24-bit data, can be and often is very transparent, as long as the sample rates are high enough, e.g. >= 44.1 kHz.

Within reason, there's no need to balance the number of resampling steps.

Using 96 kHz as your upper sampling frequency is just as good as 88.2 kHz, and vice versa.

There are audiophile myths about resampling, involving some perceived need for sample rates that are integer multiples of each other. With good resampling hardware or software, it is all moot.
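A quick sketch of why integer ratios are moot with polyphase resampling (my own illustration, using scipy): any rational rate pair reduces to an up/down factor pair, e.g. 44.1 -> 48 kHz reduces to 160/147, which is handled just as exactly as a factor of 2:

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

def resample_rate(x, fs_in, fs_out):
    """Polyphase resampling between two arbitrary integer sample rates."""
    g = gcd(fs_in, fs_out)
    return resample_poly(x, fs_out // g, fs_in // g)

fs_in, fs_out = 44_100, 48_000      # non-integer ratio, reduces to 160/147
t = np.arange(fs_in) / fs_in
x = np.sin(2 * np.pi * 1_000 * t)   # 1 second of a 1 kHz tone

y = resample_rate(x, fs_in, fs_out)
print(len(x), "->", len(y))         # 44100 -> 48000
```

The quality is set by the polyphase filter design, not by whether the ratio happens to be an integer.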

Heck, even Creative Labs eventually fixed the resampling in their SB Live! so that it was OK, at least for a few passes.

I think the least incorrect way to compare 88.2 kHz and 44.1 kHz would be to record at a higher sample rate, e.g. 176.4 kHz, downsample to 88.2 kHz and 44.1 kHz, and then upsample both versions to yet another sample rate before the listening test, for instance 192 kHz. This would apply two sample-rate conversions to each test sample.

I believe it would be preferable to minimize the number of unnecessary resamplings in such a test. In this case, though, I think it might be better to upsample the lower-rate version to the higher rate (a process which, as I understand it, should be quite straightforward and transparent, at least for integer multiples) in the interest of keeping the playback hardware constant, for the reasons Arnold described. I don't know much about the internal workings of DACs, but I would be concerned that having one switch quickly and repeatedly between sample rates, as such a test requires, might push it into behaving in a non-ideal and unintended manner, and I'm not sure how other components might respond either. Keeping the possible sources of unintended or unpredictable variation to a minimum is an important goal in experiment design, and if you upsample something, at least you know precisely what went on.

BTW, could some moderator perhaps split the post about the MP3 vs CD test please?

Upsampling and oversampling differ, at least in terms of purpose and implementation.

Quote

I also don't understand why a 192kHz DAC is supposedly cheaper to build. It is cheaper to build a good sounding 96kHz ADC than a 44.1kHz one, since the latter needs brickwall filtering.

Actually, they both need and get brickwall filtering. Some high-sample-rate DACs have a slow drop in response above 20 kHz, down to maybe 6 dB at Nyquist; then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll-off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)

Chip speed and real estate are so cheap that people will put features like a 192 kHz sample rate into the chip just to help make sure it gets bought. A chip that does not work well at 44.1 is crap in any design engineer's book, because that is the bread and butter. If you make it run at 192, then maybe it will sell a few more < $200 surround receivers to ignorant Joe Six-Pack types who think that anything with bigger numbers is better.

Quote

But oversampling DACs have quite relaxed filtering requirements already. So why should a 192kHz version be cheaper to build than a 44.1kHz version?

In fact, what has happened is that 192/24 DAC chips have been under $1.00 in production quantities for years. I've found them in $50 DVD players. The circuit-board space they sit on costs about as much as the chip, if not more. You no longer pay a significant premium for > 44 kHz audio DACs. If you pay a premium, you pay it for improved dynamic range and low distortion. The premium you pay for magnificent performance is far less than it was even 5 years ago. There is stuff out there with 120 dB converters for less than $200.

Actually, they both need and get brickwall filtering. Some high-sample-rate DACs have a slow drop in response above 20 kHz, down to maybe 6 dB at Nyquist; then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll-off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)

That's not the way it works. The reconstruction filter is necessarily analog. A 192 kHz DAC doesn't need 90 dB of rejection at 20 kHz; it only needs it at 96 kHz! On a DAC with that sampling rate, the filter is very gentle. It may hit the -3 dB point at 20-25 kHz and gently roll down to -90 dB at 96 kHz. But a 44.1 kHz DAC needs to roll down from 20 kHz to 22.05 kHz; that's steep!
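A back-of-the-envelope comparison (mine, not from the thread) of the FIR orders implied by those two transition bands, using scipy's Kaiser-window estimate; the 25 kHz passband edge for the 192 kHz case is an assumption:

```python
from scipy.signal import kaiserord

def fir_taps(f_pass, f_stop, fs, atten_db=90):
    # Kaiser-window estimate of the FIR length needed for a given
    # transition band at a given stopband attenuation.
    width = (f_stop - f_pass) / (fs / 2)   # transition width, normalized to Nyquist
    ntaps, _beta = kaiserord(atten_db, width)
    return ntaps

# 44.1 kHz DAC: flat to 20 kHz, 90 dB down by Nyquist (22.05 kHz).
print("44.1 kHz:", fir_taps(20_000, 22_050, 44_100), "taps")

# 192 kHz DAC: flat to (assumed) 25 kHz, 90 dB down by Nyquist (96 kHz).
print("192 kHz: ", fir_taps(25_000, 96_000, 192_000), "taps")
```

The narrow 20-22.05 kHz transition band needs a filter many times longer than the wide 25-96 kHz one, which is the sense in which the steepness matters, whether the filter ends up implemented in the analog or the digital domain.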

Actually, they both need and get brickwall filtering. Some high-sample-rate DACs have a slow drop in response above 20 kHz, down to maybe 6 dB at Nyquist; then they have the usual sharp cutoff. From a digital filter design viewpoint it is all pretty much the same. The gentle roll-off is window dressing. They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist. Some DACs are programmable to work either way. How moot does that make things?

Putting a gentle ramp a few feet high in front of the brick wall does not mean a significantly gentler stop when you hit the brick wall! ;-)

That's not the way it works.

You've denied quite a bit; surely it's not *all* wrong.

Quote

The reconstruction filter is necessarily analog.

Yes, but it is designed for the oversampling frequency. That puts it octaves above the Nyquist frequency of the signal being converted.

Quote

A 192khz DAC doesn't need 90dB of rejection at 20kHz. It only needs it at 96kHz!

I never meant to say that, and on review, I didn't say that. I said, "They still need a fairly complex digital filter to get the 90+ dB rejection above Nyquist." Note, I said Nyquist, not 20 kHz. What I did say about 20 kHz is, "Some high sample rate DACs have a slow drop in response above 20 kHz, to like maybe 6 dB down at Nyquist." Now, what I said was a bit in error about the "6 dB down at Nyquist"; what I really meant is "6 dB down just a bit below Nyquist".

Quote

On a DAC with that sampling rate, the filter is very gentle. It may hit the -3 dB point at 20-25 kHz and gently roll down to -90 dB at 96 kHz.

It may do a lot of things (like the Wolfson 8741), including exactly what I said.

Quote

But the 44.1 kHz sampling rate DAC needs to roll down from 20 kHz to 22.05 kHz - that's steep!

Actually, a number of AKM audio DACs are only 6 dB down at Nyquist (22.05 kHz for a 44.1 sample rate). The steep part of the slope is above Nyquist.

Benski, I just saw that you did not actually write "analog" filter. So the oversampling issue I brought up does not make sense regarding costs. Oversampling severely limits costs for the analog part of the reconstruction filter, but you were talking about the required complexity of the digital side. And indeed, a 192 kHz-only DAC (no 44.1 kHz support) could have a much simpler digital filter. The reason I didn't get the point was that I did not consider digital filtering "expensive" these days. Still, the misunderstanding was on my side. I hope I could clear that up.

It seems that what I wrote in the first part of my post here has been addressed since I read the thread, but the second part still might bear consideration.

I'm not following the discussions about brick-wall filters, since I am under the impression that the majority of modern audio converters are some version of delta-sigma, which does not use such filters. Actual sample rates are high, 64x and 128x being typical, with digital decimation being used to achieve the final sample rate and prevent aliasing. The front-end analog filters are described as "rather mild," far short of what would be required to prevent aliasing in a non-oversampling design.

My main reference is copyright 2000 (Ken Pohlmann’s Principles of Digital Audio, 4th edition), so something might have changed in this regard recently, but I’ve never seen anything about it. Is the discussion about brick-wall filters aimed at a special market or do I have some basic misunderstanding on this topic?

I would also like to throw something more into the discussion. Maybe explicit recognition of it could lead to some way to account for it in the testing. I’ve written about it in two earlier threads. Possibly some of the links to screen shots still exist in one of those.

I generated a sweep-frequency signal at 88.2 kHz, covering the entire available frequency range. Playing it out the DAC of one soundcard and into the ADC of another, I found that I got back something very close to what was initially generated in CoolEdit (except with a SB Live).

Then, playing it at 88.2 kHz and recording at 44.1 kHz, I found a definite alias image. For the first couple of kHz below the Nyquist limit, when recording at 44.1 kHz, the image was quite strong. If I turned up the resolution in CoolEdit's Spectral View, I could see the alias trace almost to zero Hz. Resampling the generated sweep tone in CoolEdit from 88.2 to 44.1 produced none of the aliasing.

I tested several reputable sound cards. All were the same (except the SB, which was much worse). When I initially posted these results in another forum there was a lot of noise about my failures until a couple of people with more expensive professional converters duplicated my results. This happened again later in another forum. While this is still only a small sample of all soundcards, I suspect they all do the same.

As discussed earlier on HA, I don't think this is a significant audio defect. Certainly I cannot hear any distortion that may be produced, especially at 20 kHz and above. Very little music has much content at the high frequencies necessary to produce such aliasing anyway, and any such high-frequency content is likely to be rather low in intensity.

Still, this does make for a real difference between recording at higher sample rates and recording at 44.1 kHz. Aliasing cannot be removed from the recording. Since the alias image is only very strong (relative to the input signal) for a couple of kHz below the Nyquist limit (meaning about 42 kHz to 44 kHz at a sample rate of 88.2 kHz), even the little distortion that might be present in 44.1 kHz recordings will be absent at 2x that rate and above.
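The effect AndyH-ha describes can be simulated in software (my own sketch, using scipy): decimating a full-band sweep without an anti-alias filter folds the content above 22.05 kHz back down into the audio band, whereas proper resampling filters it out first:

```python
import numpy as np
from scipy.signal import chirp, resample_poly

fs = 88_200
t = np.arange(fs) / fs
# Linear sweep up to 40 kHz, i.e. well past the 22.05 kHz Nyquist of 44.1k.
x = chirp(t, f0=20, f1=40_000, t1=1.0)

# Naive decimation (no anti-alias filter): everything between 22.05 and
# 40 kHz folds back below Nyquist, like an ADC with a leaky input band.
bad = x[::2]

# Proper resampling low-passes first, so nothing folds back.
good = resample_poly(x, 1, 2)

# Look at the last quarter of the sweep (instantaneous frequency roughly
# 30-40 kHz, all of which should vanish in a clean conversion to 44.1k).
seg = slice(3 * len(bad) // 4, None)
print("aliased RMS :", np.sqrt(np.mean(bad[seg] ** 2)))
print("filtered RMS:", np.sqrt(np.mean(good[seg] ** 2)))
```

The naively decimated version keeps essentially full-level energy in that segment (as an alias image), while the properly resampled version is quieter by roughly the resampler's stopband attenuation, which matches the observation that software resampling in CoolEdit produced no alias trace.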

AndyH-ha, that might be on purpose. For two lowpass filters of identical computational complexity, with A allowed to inject more imaging above f than B, ceteris paribus, A can achieve higher passband performance (0 to f Hz) than B.