Problems understanding FIR on ADSP-21369

I'm quite new to DSP and am trying to implement an adaptive FIR filter for active noise cancellation on the ADZS-21369-EZLITE evaluation board.

The desired number of coefficients is around 300. My first concern was whether the processing needed to adapt the coefficients (using LLMS) and generate an output sample is fast enough. I've adapted the block-based talkthrough example (C) to pass my processing function one sample at a time.

Programming my algorithm in a straightforward way (without compiler- or hardware-specific code) gave me two loops. The compiler optimized the first one to 4 cycles per two original iterations (the trip count is half the number of coefficients, minus 1):

[Info] This loop executes 2 iterations of the original loop in estimated 4 cycles.

[Info] Trip count = 149

and the other one executes in 9 cycles per iteration (trip count = number of coefficients - 1).

This gives me 3287 instruction cycles (149 × 4 + 299 × 9) to be carried out between two samples (for my estimate I ignored the overhead). Multiplying that by the instruction cycle time (2.5 ns) tells me that generating each output sample takes around 8.2 µs. With a sample frequency of 96 kHz (2 channels at 48 kHz each), the SPORTisr is called every 10.4 µs. Enough time for my algorithm... at least in theory. Can anyone with a better understanding of DSP confirm my estimate? The problem is that when I run the code (even with only 150 coefficients), it gets stuck in the "ProcessingTooLong" function.
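The estimate above can be reproduced with a few lines of C. This is only a sketch of the arithmetic in the post: the function names are made up for illustration, the loop costs are the ones reported by the compiler annotations, and call overhead, pipeline stalls and external-memory latency are deliberately ignored.

```c
/* Loop costs taken from the compiler annotations quoted in the post. */
#define CYCLES_PER_PAIR_LOOP1 4    /* 2 original iterations in 4 cycles */
#define CYCLES_PER_ITER_LOOP2 9    /* 1 iteration in 9 cycles */
#define CCLK_PERIOD_NS        2.5  /* instruction cycle time from the post */

/* Hypothetical helper: estimated core cycles per output sample for
   n_taps coefficients (overhead and memory stalls are NOT included). */
long estimate_cycles(int n_taps)
{
    long loop1 = (long)(n_taps / 2 - 1) * CYCLES_PER_PAIR_LOOP1;
    long loop2 = (long)(n_taps - 1) * CYCLES_PER_ITER_LOOP2;
    return loop1 + loop2;
}

/* Hypothetical helper: convert a cycle count to microseconds. */
double cycles_to_us(long cycles)
{
    return (double)cycles * CCLK_PERIOD_NS / 1000.0;
}
```

For 300 coefficients this gives 3287 cycles, i.e. about 8.2 µs, matching the figures in the post. It is a lower bound only; the real cost depends on where the buffers live and on interrupt overhead.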

After running into this problem I dug deeper into the compiler manual and found the built-in functions for FIR filtering. I tried to use them for my problem, but I don't know how: I get an error signal from the ADC, which I use to adapt the FIR coefficients. The FIR is then applied to a sound template stored in SDRAM (the noise I'm targeting is periodic, repeating about every 0.5 s). My understanding is that I apply the coefficients to the template's samples (as many samples as there are coefficients) and get ONE new value, which I then send to the DAC. Why does the fir_vec function alter an ARRAY of output samples?

Please excuse any wrong terminology, and don't hesitate to correct it. Sadly there's no DSP lecture in my studies, so I have to figure everything out by myself for my master's thesis :/.
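For reference, the coefficient adaptation described above might look roughly like the sketch below. It assumes that "LLMS" means leaky LMS, which the post does not confirm, and that the error e is defined so that a positive step size reduces it (in a real ANC setup the sign depends on how the error microphone signal is defined). The function name and parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical leaky-LMS step (assumption: LLMS = leaky LMS):
     w[i] <- (1 - mu * leak) * w[i] + mu * e * x[i]
   w    : filter coefficients, updated in place
   x    : the 'taps' most recent reference samples, newest first
   e    : current error sample from the ADC
   mu   : step size
   leak : leakage factor (0 gives plain LMS)                        */
void leaky_lms_update(float *w, const float *x, size_t taps,
                      float e, float mu, float leak)
{
    const float keep = 1.0f - mu * leak;  /* coefficient decay per step */
    const float step = mu * e;            /* common gradient scale */
    size_t i;

    for (i = 0; i < taps; i++)
        w[i] = keep * w[i] + step * x[i];
}
```

Each call touches every coefficient once, so together with the filtering loop this is where the two loops over the coefficients come from.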

1. As its name suggests, fir_vec filters a vector (an input buffer) and generates one output per input sample, i.e. a whole buffer of outputs (it is meant for block processing). There is also a scalar version of the fir function, which I think is what you need. This sample-based function returns one filtered output sample for each input sample. The State[] buffer needs to be zero-initialized only the very first time; after that, on every fir call, the current input sample is pushed into this buffer, so that it always contains the previous Taps-1 input samples. [Please refer to the VDSP run-time library manual for more details.]

In the block-based talkthrough code, the codec and the DSP's SPORTs are programmed for I2S mode with a 48 kHz sample frequency. If you set the buffer size 'NUM_SAMPLES' to 2 (one each for L and R), you will get 2 samples (L and R) and an interrupt every 20.833 µs. So 20.833 µs is the time available for processing the two samples. If you have changed the sample rate to 96 kHz, the time available for processing is 10.4 µs.
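To illustrate the state-buffer idea, here is a minimal portable sketch of a sample-based FIR. It is not the VisualDSP++ library routine and does not reproduce its signature; fir_sample and its parameters are hypothetical, and the state is kept as a circular buffer of the last taps inputs.

```c
#include <stddef.h>

/* Hypothetical sample-based FIR (NOT the library function).
   coeffs : 'taps' filter coefficients
   state  : 'taps' floats, zero-initialized once before the first call
   pos    : write index into state, set to 0 once before the first call
   Returns one output sample for the input sample x.                   */
float fir_sample(float x, const float *coeffs, float *state,
                 size_t taps, size_t *pos)
{
    float acc = 0.0f;
    size_t idx = *pos;
    size_t i;

    state[idx] = x;                     /* push the current input */

    for (i = 0; i < taps; i++) {
        acc += coeffs[i] * state[idx];  /* coeffs[i] * x(n - i) */
        idx = (idx == 0) ? taps - 1 : idx - 1;  /* step back one sample */
    }

    *pos = (*pos + 1) % taps;           /* advance the write index */
    return acc;
}
```

Because the state survives between calls, the caller only hands in the newest sample each time; the older samples are already in the buffer.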
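The time budget can also be expressed in core cycles, which is easier to compare against a measured cycle count. A small sketch, assuming a 400 MHz core clock (the 2.5 ns instruction cycle mentioned in the question); the function name is made up for illustration.

```c
/* Hypothetical helper: whole core cycles available between two
   interrupts, given the core clock and the interrupt rate in Hz. */
long cycles_available(double cclk_hz, double interrupt_rate_hz)
{
    return (long)(cclk_hz / interrupt_rate_hz);  /* truncates toward zero */
}
```

With these assumptions, one interrupt per 48 kHz frame leaves 8333 cycles, and one interrupt at 96 kHz leaves 4166 cycles; ISR entry/exit and any DMA or SDRAM stalls have to fit into the same budget.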

"This gives me 3287 instructions to be carried out between to samples (for my estimation I ignored the overhead). -----"

But the above estimate of the processing time may be wrong, as not all instructions execute in one cycle. Also, you may have certain buffers in SDRAM that are accessed during processing, and this can affect instruction throughput. [For example, an isolated read from SDRAM may cost around 6-12 CCLK cycles, depending on the SDRAM latencies involved: read latency, precharge, bank activation.] A better way to find the execution time of your processing function is to use the run-time library cycle count functions [refer to the run-time library manual] or to add asm code as below.

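asm("R4 = EMUCLK;");    /* read the cycle counter before processing */

/* ... processing function ... */

asm("R8 = EMUCLK;");    /* read the cycle counter after processing */

asm("R8 = R8 - R4;");   /* elapsed cycles, left in R8 */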

The other question, about the fir_vec function, is not very clear to me; can you please elaborate? As I understand it, fir_vec performs block processing on an input buffer and returns a pointer to the output array. Also note that the built-in compiler functions require the input, coefficient and output buffers to be in internal memory, with the input and coefficients in different internal memory blocks, in order to exploit single-cycle dual fetch and the SIMD features.

Thanks for the tip concerning the cycle count. I'm using START_CYCLE_COUNT() etc. now.

About the FIR: from my understanding, FIR filtering gives me ONE output sample (y_n) when applied to N input samples from the past (x_n-i), where N is the number of coefficients/taps. Just like in the usual FIR formula, $y_n = \sum_{i=0}^{N-1} h_i \, x_{n-i}$ (with h_i the coefficients). So why is fir_vec filling a whole array with output samples? Please don't get me wrong, I'm not saying this function is useless; I'm just trying to apply it to my problem and I don't know how.
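One way to see the relation between the two views: a block FIR simply evaluates the one-sample formula once for every input sample in the buffer, carrying the delay line across samples. The sketch below uses hypothetical names (fir_one, fir_many) and a plain shift-register delay line; it is not the library's fir_vec, and it assumes taps >= 1.

```c
#include <stddef.h>

/* One output sample: shift the delay line, insert x, take the dot
   product with the coefficients h. Assumes taps >= 1.              */
float fir_one(float x, const float *h, float *delay, size_t taps)
{
    float y = 0.0f;
    size_t i;

    for (i = taps - 1; i > 0; i--)   /* age the stored samples */
        delay[i] = delay[i - 1];
    delay[0] = x;                    /* newest sample first */

    for (i = 0; i < taps; i++)
        y += h[i] * delay[i];        /* h[i] * x(n - i) */
    return y;
}

/* Block version: n inputs in, n outputs out, same formula per sample. */
void fir_many(const float *in, float *out, size_t n,
              const float *h, float *delay, size_t taps)
{
    size_t k;

    for (k = 0; k < n; k++)
        out[k] = fir_one(in[k], h, delay, taps);
}
```

So a block of n inputs yields n outputs, each of which is still ONE y_n computed from the N most recent inputs; calling the block version with n = 1 gives the sample-by-sample behaviour.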

Well, concerning the SDRAM speed issues, I've introduced a DMA transfer from SDRAM to internal memory. This allows me to run my filter with 274 coefficients. Unfortunately I was unable to get it to work with interrupts. What I'm doing now:

