Friday, April 9, 2010

It didn't surprise me how impassioned a response I got to my post about converting audio data from integer to floating point. I posted a link to it on one mailing list and got some heated responses. Despite the extremely geeky nature of it, the fact is that I've seen this discussion on mailing lists before and it always seems to turn into a flame war. People put a lot of thought into implementing the simple conversion of audio from float to int and back and no matter what choice they make, they are invariably criticized for it, so it's only natural to be on the defensive.

While I contend that my post represents more thought and analysis (and better thought and analysis) than is available anywhere else publicly (certainly than I know of), I did not intend for it to be the be-all and end-all of the discussion, even if I implied otherwise. Some of the criticisms I received bordered on the absurd (it's true that my blog entry is not peer reviewed), while other criticisms were face-valid but irrelevant (whether one solution is more pleasing mathematically is irrelevant if it is going to produce worse-sounding results). However, digging through the criticisms, it's apparent that some things in my analysis can be improved.

To that end, I'm going to use this entry to accumulate comments and thoughts on the subject as they come up. So this is a living blog post that will be updated and revised from time to time.

April 9th 2010

- I claimed that looking at the no-DSP case was a "best case" situation, and that any DSP would only make whatever distortion occurred worse. Therefore, I argued, this was the only case that needed to be considered. Not everyone agrees with this, but it's also hard to generalize DSP. It might be worth analyzing some simple DSP like volume ramping.

- I contrasted the distortion produced by using the wrong conversion method to the distortion created by not using dither. However, error produced from truncation is most objectionable with low-level signals while only high-level signals were tested, so this is not a fair comparison.

- It would be worthwhile to test conversions from 24-bit to 16-bit of several different audio source types to determine if the harmonic distortion of the (2^n) model is relevant in that case.
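
To make the point about low-level signals concrete, here is a minimal sketch in Python of how a very quiet sine behaves when converted to 16-bit with and without dither. Everything in it is an illustrative assumption rather than the code behind the original tests: the 32768 scale factor, the round-to-nearest quantizer (standing in for plain truncation), the TPDF dither spanning roughly one LSB either side, and the names float_to_int16 and low_level_sine.

```python
import math
import random

FULL_SCALE = 32768  # 2^15: assumed scale factor for 16-bit audio


def float_to_int16(x, dither=False, rng=random):
    """Quantize one float sample to a 16-bit integer (illustrative only)."""
    scaled = x * FULL_SCALE
    if dither:
        # TPDF dither: difference of two uniform values, range (-1, 1) LSB
        scaled += rng.random() - rng.random()
    q = math.floor(scaled + 0.5)        # round to nearest integer
    return max(-32768, min(32767, q))   # clip to the int16 range


def low_level_sine(amplitude_lsb, n=4800, freq=1000.0, rate=48000.0):
    """Sine wave whose peak amplitude is given in 16-bit LSBs."""
    amp = amplitude_lsb / FULL_SCALE
    return [amp * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


rng = random.Random(0)            # fixed seed so the run is repeatable
signal = low_level_sine(0.4)      # peak is below half an LSB

plain = [float_to_int16(s) for s in signal]
dithered = [float_to_int16(s, dither=True, rng=rng) for s in signal]

# Positive when the dithered output still follows the input sine
correlation = sum(s * d for s, d in zip(signal, dithered))

print("undithered nonzero samples:", sum(1 for v in plain if v != 0))
print("dithered nonzero samples:  ", sum(1 for v in dithered if v != 0))
print("dithered output tracks input:", correlation > 0)
```

Without dither, every sample of this signal rounds to zero, so the tone disappears entirely. With dither, the output is a noisy mix of -1, 0 and 1 that still correlates with the input. A test that uses only high-level signals never exercises this regime, which is why the comparison in the second item above was not a fair one.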