I would like to process 12 to 20 seconds of incoming audio sampled at 44100 Hz. I must process this audio in real time on an STM embedded kit (perhaps also on an Android smartphone). I'm trying to detect and count the occurrences of a signal of roughly 6500 samples inside the incoming audio. The largest FFT available is 1024 points.

I was thinking about applying overlap-add, but the filter would have 6500 coefficients, which is larger than the maximum FFT size of 1024. I tried to simulate this in Matlab using fftfilt, but the function help says:

If you supply a value for n, fftfilt chooses an FFT length, nfft, of 2^nextpow2(n) and a data block length of nfft - length(b) + 1. If n is less than length(b), fftfilt sets n to length(b).

This makes me think that I'm forced to use an FFT of at least 6500 samples (which I can't) and then process 1 incoming audio sample at a time (super inefficient).
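To make the arithmetic concrete, here is a small Python sketch that applies only the rule quoted above to the numbers in the question. The function name `fftfilt_sizes` is hypothetical and this is not MATLAB's implementation, just the stated formula.

```python
def fftfilt_sizes(n, len_b):
    """Apply the rule quoted from the fftfilt help text (hypothetical helper).

    n     : requested FFT length
    len_b : number of filter coefficients, length(b)
    Returns (nfft, block_length).
    """
    # "If n is less than length(b), fftfilt sets n to length(b)."
    n = max(n, len_b)
    # nfft = 2^nextpow2(n): the smallest power of two that is >= n
    nfft = 1 << (n - 1).bit_length()
    # data block length = nfft - length(b) + 1
    return nfft, nfft - len_b + 1


# Requesting n = 1024 with a 6500-coefficient filter
nfft, block = fftfilt_sizes(1024, 6500)
print(nfft, block)  # 8192 1693
```

So under the quoted rule a single unpartitioned filter of 6500 coefficients leads to an 8192-point FFT, which is well above the 1024-point limit.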

Along with the technique pointed out by Stanley in his answer below, another option is partitioned convolution. Split your long filter into shorter sections and implement each as a separate filter. Then, delay and sum the filter outputs appropriately to reconstruct the response you would have gotten from using the long filter to begin with.
– Jason R, Aug 2 '18 at 19:26
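A minimal sketch of the partitioned convolution described in this comment, written in Python with NumPy for simulation. It assumes a 1024-point FFT, 512-sample input blocks, and 512-sample filter sections, so that each block-by-section product fits in one FFT without circular aliasing. The names `partition_filter` and `partitioned_convolve` and these sizes are illustrative choices, not something given in the question.

```python
import numpy as np

NFFT = 1024        # largest FFT assumed available
BLOCK = NFFT // 2  # new input samples per block; also the filter section length


def partition_filter(h, part_len=BLOCK, nfft=NFFT):
    """Split h into part_len-sample sections and return the FFT of each one."""
    n_parts = -(-len(h) // part_len)          # ceiling division
    padded = np.zeros(n_parts * part_len)
    padded[:len(h)] = h
    return np.fft.rfft(padded.reshape(n_parts, part_len), n=nfft, axis=1)


def partitioned_convolve(x, h, block=BLOCK, nfft=NFFT):
    """Linear convolution of x with h using only nfft-point FFTs."""
    H = partition_filter(h, block, nfft)
    n_parts = H.shape[0]
    n_out = len(x) + len(h) - 1
    n_blocks = -(-n_out // block)             # extra zero blocks flush the tail
    x_pad = np.zeros(n_blocks * block)
    x_pad[:len(x)] = x

    # Frequency-domain delay line: row k holds the spectrum of the input
    # block that arrived k blocks ago, which pairs with filter section k.
    fdl = np.zeros((n_parts, nfft // 2 + 1), dtype=complex)
    overlap = np.zeros(nfft - block)
    y = np.zeros(n_blocks * block)

    for b in range(n_blocks):
        fdl = np.roll(fdl, 1, axis=0)         # age every stored spectrum by one block
        fdl[0] = np.fft.rfft(x_pad[b * block:(b + 1) * block], n=nfft)
        # Sum of delayed section outputs, computed with one inverse FFT
        seg = np.fft.irfft((fdl * H).sum(axis=0), n=nfft)
        seg[:nfft - block] += overlap         # overlap-add the previous tail
        y[b * block:(b + 1) * block] = seg[:block]
        overlap = seg[block:]
    return y[:n_out]
```

With a 6500-sample filter this gives 13 sections, so each incoming 512-sample block costs one forward FFT, 13 spectrum multiplications, and one inverse FFT. If the goal is detection by correlation rather than filtering, one possible use is to pass the time-reversed template, for example `partitioned_convolve(audio, template[::-1])`, where `audio` and `template` are placeholder names.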

are you filtering with an FIR filter of 6500 taps? is that what you're doing? is this a matched filter problem?
– robert bristow-johnson, Aug 2 '18 at 21:40