Hi everyone,

I've been doing some performance tests on the filter accelerators of the SHARC 21469. I also calculated the predicted outcome using the formula given in the HRM. To check the throughput, I called a function that just waits until the accelerator finishes; that wait time is then the runtime of the accelerator. Using the statistical profiling tool, I found the calculations to be fairly far off for the IIR accelerator and off by a factor of two for the FIR accelerator. Below I've listed a few cases with predicted and actual throughput. I'd be glad if anyone could comment on whether my calculations are wrong or whether I'm missing something needed to get the best out of the accelerators.

1. IIR Accelerator

Case 1:

- sample based, sample rate 192 kHz, which leaves roughly 2344 processor cycles for each sample (450M/192k)

- 8 channels with 8 biquads each should take 608 cycles in total, (36+5*8*1)*8, resulting in a load of ~26%

-> My tests showed that only 4 channels of 8 biquads each could be processed, giving a processor load of about 90%

Case 2:

- block based, block size 512, sample rate 48 kHz, which leaves 4800000 cycles for each block

- 24 channels with 12 biquads each -> 738144 cycles, (36+5*12*512)*24, or ~15% processor load

-> results show approximately 30% processor load
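To make the arithmetic behind both cases easy to re-check, here is a minimal sketch of the estimate as I applied it. The expression (36 + 5 * biquads * block size) per channel is taken from the numbers above, not copied from the HRM, and the function names are just placeholders for illustration. The budget is computed in core clock cycles at 450 MHz, which is the assumption I made.

```python
CCLK_HZ = 450e6  # core clock used for the budget in the post


def iir_cycles(channels, biquads, block_size=1):
    """Estimate as applied in the post: (36 + 5 * biquads * block_size) per channel."""
    return (36 + 5 * biquads * block_size) * channels


def cycle_budget(sample_rate_hz, block_size=1, clock_hz=CCLK_HZ):
    """Clock cycles available per sample (block_size=1) or per block."""
    return clock_hz * block_size / sample_rate_hz


def load_percent(cycles, budget):
    """Predicted load as a percentage of the available cycles."""
    return 100.0 * cycles / budget


# Case 1: sample based, 192 kHz, 8 channels x 8 biquads
case1 = iir_cycles(8, 8)
print(case1, round(load_percent(case1, cycle_budget(192e3)), 1))  # 608 25.9

# Case 2: block based, 48 kHz, block size 512, 24 channels x 12 biquads
case2 = iir_cycles(24, 12, 512)
print(case2, round(load_percent(case2, cycle_budget(48e3, 512)), 1))  # 738144 15.4
```

These are the predicted figures only; the measured loads quoted above (about 90% for 4 channels in case 1, about 30% in case 2) come from the profiler.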

What concerns me most is case 1. I'm programming a loudspeaker controller on a SHARC and, running at a 192 kHz sample rate, I'd like to get the most out of the accelerators.

2. FIR Accelerator

Really only one important test here, at maximum load. My calculations indicate that, for example, 32 channels with 1024 taps each should be possible at 48 kHz with a block size of 512. My tests, however, show a maximum capacity of only 512 taps on 32 channels, or 1024 taps on 16 channels, which results in a processor load of ~95%. Since I'm off by a factor of 2, I thought there might be something fishy with the way I'm setting up the accelerators, so could someone confirm whether my results reflect the intended capacity?
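As a rough cross-check of the FIR figures, here is a minimal sketch that only counts multiply-accumulates per block and ignores any per-channel or per-block overhead. It is not the HRM formula. The number of MACs the accelerator completes per clock is not stated in this post, so `MACS_PER_CYCLE = 4` is an assumption that should be checked against the HRM, and the function names are placeholders. The load is evaluated for a full-rate and a half-rate accelerator clock, since a factor of two is exactly what is in question.

```python
CCLK_HZ = 450e6      # core clock from the post
MACS_PER_CYCLE = 4   # ASSUMPTION: MACs completed per accelerator clock, verify in the HRM


def fir_macs_per_block(channels, taps, block_size):
    """Multiply-accumulates needed to filter one block on all channels."""
    return channels * taps * block_size


def fir_load_percent(channels, taps, block_size, sample_rate_hz,
                     clock_hz, macs_per_cycle=MACS_PER_CYCLE):
    """MAC-only load estimate; setup and DMA overhead are ignored."""
    cycles = fir_macs_per_block(channels, taps, block_size) / macs_per_cycle
    budget = clock_hz * block_size / sample_rate_hz
    return 100.0 * cycles / budget


for label, clock in (("full rate", CCLK_HZ), ("half rate", CCLK_HZ / 2)):
    for channels, taps in ((32, 1024), (32, 512), (16, 1024)):
        load = fir_load_percent(channels, taps, 512, 48e3, clock)
        print(f"{label}: {channels} ch x {taps} taps -> {load:.1f}%")
```

Under these assumptions, 32 channels of 1024 taps only fit if the accelerator runs at the full core clock; at half that rate the same load is reached with 32 x 512 or 16 x 1024, which is the pattern seen in the measurements.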

I can add code in case it's needed.

Thanks to all of you in advance

Arne

Hi Arne,

When comparing the expected and actual throughput, please note that the expressions given in the HRM for performance cycles are in terms of peripheral clock cycles (PCLK = CCLK/2), not CCLK cycles. Another important point for the IIR accelerator is that the cycles for loading coefficients are not included, since the accelerator has to do this only once.
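To illustrate the effect of the clock domain, here is a minimal sketch that re-evaluates the two IIR cycle counts from your post (608 and 738144) against a budget counted in peripheral clock cycles instead of core clock cycles. It assumes CCLK = 450 MHz as in your calculation; the function name is only a placeholder, and coefficient loading is left out as described above.

```python
CCLK_HZ = 450e6          # core clock from the original post
PCLK_HZ = CCLK_HZ / 2    # peripheral clock, the unit of the HRM cycle expressions


def load_percent(accel_cycles, sample_rate_hz, block_size=1, accel_clock_hz=PCLK_HZ):
    """Load when accel_cycles are counted in accelerator (peripheral) clock cycles."""
    budget = accel_clock_hz * block_size / sample_rate_hz
    return 100.0 * accel_cycles / budget


# Case 1: 608 cycles per sample at 192 kHz
print(round(load_percent(608, 192e3), 1))          # 51.9

# Case 2: 738144 cycles per block of 512 samples at 48 kHz
print(round(load_percent(738144, 48e3, 512), 1))   # 30.8
```

With this correction, case 2 comes out close to the ~30% you measured. Case 1 roughly doubles to about 52% for 8 channels, which still does not fully explain a load of about 90% with only 4 channels, so that case may need a closer look.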

Please let me know if you still see a difference between the expected and actual throughput for the FIR and IIR accelerators. In that case, please give the complete configuration (tap length, window size, number of channels, etc.) so that I can test a simple code example here to replicate the problem.

Hope this helps.

Thanks,

Mitesh