I'm currently trying to reproduce the getSpectrum function of the FMOD audio library. This function reads the PCM data of the currently playing buffer, applies a window to that data, and applies an FFT to get the spectrum.

It returns an array of floats, where each float is between 0 and 1 dB (10.0f * (float)log10(val) * 2.0f).

I'm not sure that what I'm doing is what I should be doing, so I'll explain it:

First, I get the PCM data in a 4096-byte buffer. According to the documentation, PCM data is composed of samples, each of which is a left-right pair of values.

In my case I'm working with 16-bit samples, as in the image above. So, to work with only the left channel, I save the left PCM data in a short array by doing:
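
A minimal sketch of that kind of extraction, assuming interleaved little-endian signed 16-bit stereo data (L0 R0 L1 R1 ...). `extract_left` is a hypothetical name used for illustration; it is neither the asker's code nor an FMOD call.

```c
#include <stddef.h>

/* Hypothetical helper: copy the left channel out of interleaved 16-bit
   stereo PCM. Assumes little-endian signed samples, left channel first.
   Returns the number of frames written to left. */
size_t extract_left(const unsigned char *bytes, size_t byte_count, short *left)
{
    size_t frames = byte_count / 4;   /* 2 channels * 2 bytes per sample */
    for (size_t i = 0; i < frames; ++i) {
        unsigned lo = bytes[4 * i];       /* low byte of the left sample */
        unsigned hi = bytes[4 * i + 1];   /* high byte of the left sample */
        left[i] = (short)(lo | (hi << 8));
    }
    return frames;
}
```

Under these assumptions a 4096-byte buffer holds 1024 stereo frames, so the left channel alone is 1024 samples.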

1 Answer

You're a little confused about log (dB) scales: you don't get a range of 0 to 1 dB. For 16-bit audio you typically get a range of about 96 dB, where the upper and lower ends are somewhat arbitrary, e.g. 0 to -96 dB, or 96 dB to 0 dB, or any other range you like, depending on various factors. You probably just need to shift and scale your spectrogram plotting by a suitable offset and factor to account for this.

(Note: the range of 96 dB comes from the formula 20 * log10(2^16), where 16 is the number of bits.)

Thanks for your reply, but oddly, when applying 20 * log10(amp) I get values up to 120. It may be because of the (float) cast I do, but I'm not sure about that.
– Lowip, Mar 10 '12 at 11:39

It's really only the range that matters - dB is a ratio relative to some notional 0 dB reference - you just need to look at the min/max values that you are getting and shift/scale accordingly to get a reasonable intensity range on your spectrogram.
– Paul R, Mar 10 '12 at 12:14
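
One way to read the shift-and-scale suggestion in the comment above is a linear mapping from the observed dB range onto [0, 1]. The sketch below assumes that interpretation; `normalize_db` is a hypothetical name, and it assumes all inputs are finite (a bin of exactly zero would give negative infinity from log10 and should be clamped beforehand).

```c
#include <stddef.h>

/* Hypothetical helper: map dB values linearly onto [0, 1] using the
   observed minimum and maximum, for use as spectrogram intensity.
   Assumes every input value is finite. */
void normalize_db(const float *db, size_t n, float *out)
{
    if (n == 0) return;
    float lo = db[0], hi = db[0];
    for (size_t i = 1; i < n; ++i) {
        if (db[i] < lo) lo = db[i];
        if (db[i] > hi) hi = db[i];
    }
    float range = hi - lo;
    for (size_t i = 0; i < n; ++i)
        out[i] = (range > 0.0f) ? (db[i] - lo) / range : 0.0f;
}
```

With inputs of 0, 60 and 120 dB this yields 0, 0.5 and 1, so values "up to 120" are no obstacle to plotting as long as the range is taken from the data.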

The Hann window above multiplies by an average value of 0.5. Also, the max value of signed 16-bit data is 2^15-1, not 2^16. Backing off 6 dB for each of these halvings of scale puts the peak at 84 dB.
– Mark Borgerding, Mar 10 '12 at 19:56
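
Written out, the arithmetic behind that last comment looks roughly like this (using 2^15 in place of 2^15-1, which changes the result by far less than 0.1 dB):

```latex
\begin{align*}
20\log_{10}\!\left(2^{16}\right) &\approx 96.3\ \text{dB} \\
20\log_{10}\!\left(2^{15}\right) &\approx 90.3\ \text{dB}
  && \text{(signed 16-bit peak: first halving)} \\
20\log_{10}\!\left(0.5 \cdot 2^{15}\right) = 20\log_{10}\!\left(2^{14}\right) &\approx 84.3\ \text{dB}
  && \text{(Hann average gain of 0.5: second halving)}
\end{align*}
```

Each halving costs $20\log_{10}(2) \approx 6.02$ dB, which is where the two 6 dB steps come from.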