I've been doing some performance tests on the filter accelerators of the SHARC 21469. I also calculated the predicted outcome using the formula given in the HRM. To measure the throughput, I called a function that just waits until the accelerator finishes; the time spent in that function is then the runtime of the accelerator. Using the statistical profiling tool, I found the predictions to be fairly off for the IIR Accelerator and off by a factor of two for the FIR Accelerator. Below I've listed a few cases with predicted and actual throughput. I'd be glad if anyone could comment on whether my calculations are wrong or whether I'm missing something needed to get the best out of the accelerators.
1. IIR Accelerator
- sample based, sample rate 192 kHz, which leaves about 2344 processor cycles for each sample (450M/192k)
- 8 channels with 8 biquads each should take 608 cycles in total, (36+5*8*1)*8, resulting in ~25% processor load
-> My tests showed that only 4 channels of 8 biquads each could be processed, giving a processor load of about 90%
- block based, block size 512, sample rate 48 kHz, which leaves 4800000 cycles for each block
- 24 channels with 12 biquads each -> 738144 cycles or ~15% processor load
-> results show approximately 30% processor load
What concerns me most is the first (sample-based) case. I'm programming a loudspeaker controller with a SHARC, and at a 192 kHz sample rate I'd like to get the most out of the accelerators.
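For reference, the arithmetic behind the predicted IIR figures can be sketched in C as follows. This only reproduces the pattern (36 + 5 * biquads * block size) * channels that I used for the numbers above, together with the 450 MHz core clock; the function names are made up for illustration, and nothing here touches the accelerator itself.

```c
#include <assert.h>

#define CORE_CLOCK_HZ 450000000.0   /* core clock assumed in the figures above */

/* Predicted accelerator cycles per block, following the pattern
   (36 + 5 * biquads * block_size) * channels used in the cases above. */
static long iir_predicted_cycles(long channels, long biquads, long block_size)
{
    assert(channels > 0 && biquads > 0 && block_size > 0);
    return (36 + 5 * biquads * block_size) * channels;
}

/* Core cycles available for one block at the given sample rate. */
static double cycles_available(double sample_rate_hz, long block_size)
{
    assert(sample_rate_hz > 0.0 && block_size > 0);
    return CORE_CLOCK_HZ / sample_rate_hz * (double)block_size;
}

/* Predicted load as a fraction of the available core cycles (0.0 to 1.0). */
static double iir_predicted_load(long channels, long biquads,
                                 long block_size, double sample_rate_hz)
{
    return (double)iir_predicted_cycles(channels, biquads, block_size)
           / cycles_available(sample_rate_hz, block_size);
}
```

Calling iir_predicted_cycles(8, 8, 1) gives the 608 cycles of the sample-based case, and iir_predicted_cycles(24, 12, 512) gives the 738144 cycles of the block-based case; iir_predicted_load with the matching sample rates gives the fractions I rounded to ~25% and ~15%.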
2. FIR Accelerator
Really only one important test here, at maximum load. My calculations indicate that 32 channels with 1024 taps each should be possible at 48 kHz with a block size of 512. Again, though, my tests show the maximum capacity to be only 512 taps on 32 channels, or 1024 taps on 16 channels, which results in a processor load of ~95%. Since I'm off by a factor of 2, I thought there might be something fishy with the way I set up the accelerators, so could someone confirm whether my results reflect the intended capacity?
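To make the factor of two concrete without relying on any particular accelerator formula, here is a small C sketch that only counts the multiply-accumulates a filter bank needs per block and divides by the core cycles available in that block. The function names are hypothetical, and the 450 MHz core clock is the same assumption as in the IIR figures.

```c
#include <assert.h>

#define CORE_CLOCK_HZ 450000000.0   /* core clock assumed in the figures above */

/* Multiply-accumulates needed per block: one MAC per tap, per output
   sample, per channel. */
static double fir_macs_per_block(long channels, long taps, long block_size)
{
    assert(channels > 0 && taps > 0 && block_size > 0);
    return (double)channels * (double)taps * (double)block_size;
}

/* MACs that must complete per core cycle to keep up in real time. */
static double fir_macs_per_core_cycle(long channels, long taps,
                                      long block_size, double sample_rate_hz)
{
    double cycles_per_block;

    assert(sample_rate_hz > 0.0);
    cycles_per_block = CORE_CLOCK_HZ / sample_rate_hz * (double)block_size;
    return fir_macs_per_block(channels, taps, block_size) / cycles_per_block;
}
```

With these, 32 channels of 1024 taps at 48 kHz and block size 512 need roughly 3.5 MACs per core cycle, while the configurations I actually reached (32 channels of 512 taps, or 16 channels of 1024 taps) need roughly 1.75.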
I can add code in case it's needed.
Thanks to all of you in advance