Best use of FFT

This is a discussion on Best use of FFT within the C++ Programming forums, part of the General Programming Boards category.

Well, there isn't really a proper board for this, so I'll post it here... I'm sorry if this is the wrong section.

I've implemented FFT. I understand it. But my question is: how best to use it?
I'd like to input some audio samples and get the frequencies that are actually most present in this sound (or the part of the sound). I've tried to input a simple sine function and while it does report the proper frequency to be most present, the frequencies before and after it seem to be pretty 'present' as well. That is, the intensity of the proper frequency is highest, but those of a few hundred Hz more or less are quite high as well. Is there a proper way to distinguish properly between them?
Also, what is the best number of points (samples) to use?
And finally, what is the best way to input the data? Let's say I'm using 4096 points (samples) (I currently am). Should I input the first 4096 samples, then the next 4096? Or is there a better method, for instance: first input 4096 samples, then the last 2048 samples of the previous buffer followed by the next 2048 samples, and so on? So, should the input overlap or not?

>> the frequencies before and after it seem to be pretty 'present' as well
I'd imagine that's because the signal isn't periodic within the buffer. If the buffer that is transformed does not contain an integer number of periods, power leaks into frequencies other than the fundamental; this is called spectral leakage. It can be reduced by windowing. A window tapers the ends of the buffer toward zero by multiplying the signal by a window function in the time domain (which is the same as convolving the spectrum with the window's transform in the frequency domain), so the non-periodicity of the signal contributes less to the side bands.
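To make the windowing step concrete, here is a minimal sketch that applies a Hann window to a buffer before it goes into the FFT. The function name `applyHannWindow` is made up for this example, and Hann is only one common choice of window; nothing in the thread says which window (if any) the original poster ended up using.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Multiply a buffer by a Hann window in place.
// w[i] = 0.5 * (1 - cos(2*pi*i / (N - 1))), which is 0 at both ends
// of the buffer and 1 in the middle.
// (applyHannWindow is a hypothetical helper name, not from the thread.)
void applyHannWindow(std::vector<double>& samples) {
    const std::size_t n = samples.size();
    if (n < 2) return;  // nothing sensible to taper
    const double pi = std::acos(-1.0);
    for (std::size_t i = 0; i < n; ++i) {
        const double w = 0.5 * (1.0 - std::cos(2.0 * pi * i / (n - 1)));
        samples[i] *= w;
    }
}
```

You would call this on each 4096-sample buffer right before transforming it. The peak gets a little wider, but the side bands far from the peak should drop a lot.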

As for N, I can't remember, to be honest. 4096 seems high, but it depends on what it's being used for. The overlap you describe is what Welch's method uses, I think: it averages the spectra of overlapping (usually windowed) segments. The overlap makes use of the data that the window tapers away at the segment edges, so you get a smoother estimate without needing more samples. Of course it increases computation, but it gives better results. If your application demands it, use it. Otherwise don't.
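As a rough sketch of the overlap scheme from the question, the helper below lists where each frame would start for a given frame size and hop. The name `frameStarts` and its parameters are assumptions for illustration; it only computes offsets and leaves the windowing, FFT, and any averaging to the caller.

```cpp
#include <cstddef>
#include <vector>

// Return the start index of every full frame of `frameSize` samples,
// advancing by `hop` samples each time.
// hop == frameSize      -> back-to-back frames, no overlap
// hop == frameSize / 2  -> the 50% overlap described in the question
// (frameStarts is a hypothetical helper name, not from the thread.)
std::vector<std::size_t> frameStarts(std::size_t totalSamples,
                                     std::size_t frameSize,
                                     std::size_t hop) {
    std::vector<std::size_t> starts;
    if (frameSize == 0 || hop == 0) return starts;  // avoid an endless loop
    for (std::size_t s = 0; s + frameSize <= totalSamples; s += hop) {
        starts.push_back(s);
    }
    return starts;
}
```

With 16384 samples and 4096-sample frames, a hop of 4096 gives 4 frames, while a hop of 2048 gives 7 frames starting at 0, 2048, 4096 and so on. A trailing partial frame is simply dropped here; zero-padding it would be another option.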

Also, if you don't have 4096 (for example) data points, you can always zero-pad the end of the buffer. That can make the spectrum look better: it interpolates between the bins, although it doesn't add any real resolution.
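Zero-padding is only a few lines. This is a minimal sketch, assuming the samples are held in a `std::vector<double>`; `zeroPad` is a made-up name for this example.

```cpp
#include <cstddef>
#include <vector>

// Return a copy of `samples` extended with zeros up to `fftSize` entries.
// A buffer that is already fftSize or longer is returned unchanged
// (this sketch never truncates).
// (zeroPad is a hypothetical helper name, not from the thread.)
std::vector<double> zeroPad(const std::vector<double>& samples,
                            std::size_t fftSize) {
    std::vector<double> padded(samples);
    if (padded.size() < fftSize) {
        padded.resize(fftSize, 0.0);
    }
    return padded;
}
```

If you also window the data, apply the window to the real samples first and pad afterwards, so the taper ends where the data ends.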

This looks OK to me (although I'm really inexperienced with audio processing; this is the first thing I'm doing). Even though the other frequencies show up a bit, it's easy to see that around 450 Hz is most present.

So, sin alpha*sin beta = (1/2)(cos(alpha-beta) - cos(alpha+beta)) (hooray for trig). So multiplying gives you two frequencies that are different from the ones you started with: the difference and the sum. I would expect, if you want to combine what you've got, you should just add rather than multiply.
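The identity above can be checked numerically, and it also tells you where the peaks should land when two sines are multiplied. The sketch below does both; all three function names are invented for this example, and the 440 Hz and 100 Hz values in the note after it are arbitrary test frequencies, not the ones from the original code.

```cpp
#include <cmath>
#include <utility>

// Frequencies (Hz) present in sin(2*pi*f1*t) * sin(2*pi*f2*t):
// first = |f1 - f2| (difference), second = f1 + f2 (sum).
// (All names here are hypothetical, chosen for illustration.)
std::pair<double, double> productFrequencies(double f1, double f2) {
    return {std::fabs(f1 - f2), f1 + f2};
}

// Left-hand side of the identity: sin(alpha) * sin(beta).
double productOfSines(double alpha, double beta) {
    return std::sin(alpha) * std::sin(beta);
}

// Right-hand side: (1/2) * (cos(alpha - beta) - cos(alpha + beta)).
double halfCosDifference(double alpha, double beta) {
    return 0.5 * (std::cos(alpha - beta) - std::cos(alpha + beta));
}
```

For example, multiplying a 440 Hz sine by a 100 Hz sine should show peaks at 340 Hz and 540 Hz and nothing at 440 Hz or 100 Hz, whereas adding them keeps both original peaks.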

you'd get better results. It seems to me that 46.whatever is quite close to X sine periods, so you get great spikes from that. But 98 clearly isn't, which is why there is so much spectral leakage. If i*2.0*M_PI/46.545451475 is close to a period, then X*i*2.0*M_PI/46.545451475 will be too, with X being an integer. It's hard to think of a reason for why, but here's some top level maths about it.

Fourier transforms will convert any signal in the time domain to a sum of sinusoids in the frequency domain (an infinite sum in general). So if the input doesn't consist of a whole number of sine periods then the input isn't periodic over the buffer, which it needs to be for best results. This is why audio sample buffers you'd take would normally be small - your voice is repetitive over small times, but over larger ones (more than one second), it isn't. This is also why the signal is often windowed (multiplied by a window function) in the time domain, to reduce the effect of the non-periodicity.

@twomers:
I could understand leakage may increase the intensity of the frequencies near the actual frequency, thinking about how FFT works. However, moving the peaks, I can't. True, I haven't studied it mathematically, just intuition.

Besides, Tabstop seems to be right here. Changing it to add up the sinusoids fixed it and put the peaks into exactly the right positions. It fixed my problem completely.

m37h0d: Thanks for the link. Although I wouldn't just join for one single question, unless I have to. Besides, I figured since there are a few really smart guys here, some will know the answer. And I was right. But should I have more questions, I'll certainly join there.

>> I've tried to input a simple sine function and while it does report the proper frequency to be most present, the frequencies before and after it seem to be pretty 'present' as well. That is, the intensity of the proper frequency is highest, but those of a few hundred Hz more or less are quite high as well. Is there a proper way to distinguish properly between them?

You need to learn some more basics. A discrete FFT obviously only has a finite number of bins, yet the input can contain any frequency, not just the ones at the bin centres. So if some frequency is present which does not EXACTLY fall into one of the bins, it will be smeared across several bins.
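The smearing is easy to reproduce. The sketch below evaluates single DFT bins straight from the definition (slow, for illustration only, not an FFT) so that an on-bin sine can be compared with one that falls between two bins. The helper names are made up, and `n` is assumed to be greater than zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Centre frequency in Hz of bin k for an n-point transform (assumes n > 0).
// (All names in this block are hypothetical.)
double binFrequency(std::size_t k, double sampleRate, std::size_t n) {
    return k * sampleRate / n;
}

// n samples of a unit-amplitude sine at freqHz.
std::vector<double> makeSine(double freqHz, double sampleRate, std::size_t n) {
    const double pi = std::acos(-1.0);
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = std::sin(2.0 * pi * freqHz * i / sampleRate);
    }
    return x;
}

// Magnitude of DFT bin k, computed directly from the definition
// X[k] = sum over i of x[i] * exp(-j*2*pi*k*i/n). O(n) per bin.
double dftBinMagnitude(const std::vector<double>& x, std::size_t k) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    double re = 0.0, im = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double angle = 2.0 * pi * k * i / n;
        re += x[i] * std::cos(angle);
        im -= x[i] * std::sin(angle);
    }
    return std::sqrt(re * re + im * im);
}
```

With 64 samples at a 64 Hz sample rate the bins are 1 Hz apart. An 8 Hz sine puts all of its energy in bin 8 and essentially none in bin 9, while an 8.5 Hz sine puts large, roughly equal magnitudes in both bins 8 and 9 and smaller amounts further out.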

If you need finer frequency resolution, you need a larger sized FFT. But a larger sized FFT decreases your temporal resolution. Tada, you've discovered the Uncertainty Principle (it's mathematically the same thing as the Uncertainty Principle from quantum mechanics). This mathematical fact is inescapable.

EDIT: The comment about non-periodicity also applies, but the effect of a signal which is not smooth across the period boundary is to manifest as very high frequencies, NOT frequencies close to the one you are looking for. You're probably seeing a combination of both effects.

The non-periodic noise can be reduced by applying a windowing function to your signal... This is all very basic DSP stuff. I suggest this online book: http://www.dspguide.com/