Meta

Tag: Rectangular Window

In many of my earlier postings, I stated what happens in MP3-compressed sound somewhat inaccurately. One reason is the fact that an overview requires that information be combined from numerous sources. While earlier WiKiPedia articles tended to be quite incomplete on this subject, it happens that more-recent WiKi-coverage has become quite complete, yet still requires that users click deeper and deeper, into subjects such as the Type 4 Discrete Cosine Transform, the Modified Discrete Cosine Transform, and Polyphase Quadrature Filters.

What seems to happen with MP3 compression, which is also known as MPEG-2, Layer 3, is that the Discrete Cosine Transform is not applied to the audio directly, but that rather, the audio stream is divided down to 32 sub-bands in fact, and that the MDCT is applied to each sub-band individually.

Actually, after the coefficients are computed, a specific filter is applied to them, to reduce the aliasing that happened, just because of the PQF Filter-bank.

I cannot be sure that this was always how MP3 was implemented, because if we take into account the fact that with PQF, every second sub-band is frequency-inverted, we may be able to obtain equivalent results just by performing the Discrete Cosine Transform which is needed, directly on the audio. But apparently, there is some advantage in subdividing the spectrum into its 32 sub-bands first.

One advantage could be, that doing so reduces the amount of computation required. Another advantage could be the reduction of round-off errors. Computing many smaller Fourier Transforms has generally accomplished both.

Also, if the spectrum is first subdivided in this way, it becomes easier to extract the parameters from each sub-band, that will determine how best to quantize its coefficients, or to cull ones either deemed to be inaudible, or aliased artifacts.