I was wondering if anyone had any insight into the intended usage model of the HW accelerators? The concept of off-loading MIPS is smart and makes sense as a bullet point, but it's not obvious to me how to leverage them in a real-time system.
For example, let's consider a classic audio system with block-based processing:
In this system, we have a chained DMA that periodically pulls in 32 samples from an ADC. When we get the DMA-complete interrupt, we do two things: 1) update some pointers so we can handle the new data, then 2) raise a lower-priority interrupt (e.g., USER0) where we do our audio processing.
When we RTI from our DMA-complete ISR, we vector to the signal processing thread [ISR]. Being a real-time system, we need to complete this processing and return to the while(1) loop in main() before the next DMA completes. It is HERE where the FFTs and filters live... in the "ProcessAudioBlock()" subroutines.
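To make the structure above concrete, here is a rough host-side sketch in plain C. Everything in it is an assumption for illustration: the names dma_complete_isr(), user0_isr(), dma_current_rx() and friends are hypothetical, the "software interrupt" is just a flag, and the body of ProcessAudioBlock() is a placeholder gain. It is not the real SHARC interrupt API, only the shape of the two-level scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_SIZE 32  /* samples per DMA block, as in the description above */

/* Ping-pong buffers: the DMA fills one while the other is processed. */
static float rx_buf[2][BLOCK_SIZE];
static float tx_buf[2][BLOCK_SIZE];

static volatile int  dma_fill_idx  = 0;     /* buffer the DMA is filling now   */
static volatile int  ready_idx     = -1;    /* buffer waiting to be processed  */
static volatile bool user0_pending = false; /* stand-in for the USER0 latch    */
static unsigned      overruns      = 0;     /* blocks that missed the deadline */

/* Placeholder for the real filters/FFTs: a simple -6 dB gain. */
static void ProcessAudioBlock(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = 0.5f * in[i];
}

/* High-priority ISR (hypothetical name): pointer bookkeeping only. */
void dma_complete_isr(void)
{
    if (user0_pending)
        overruns++;            /* previous block was not finished in time */
    ready_idx = dma_fill_idx;  /* 1) update pointers to the new data      */
    dma_fill_idx ^= 1;
    user0_pending = true;      /* 2) raise the lower-priority interrupt   */
}

/* Lower-priority ISR (hypothetical name): all of the audio processing. */
void user0_isr(void)
{
    assert(ready_idx == 0 || ready_idx == 1);
    int idx = ready_idx;
    ProcessAudioBlock(rx_buf[idx], tx_buf[idx], BLOCK_SIZE);
    user0_pending = false;     /* done before the next DMA-complete, we hope */
}

/* Accessors so a host-side simulation can drive and inspect the model. */
float       *dma_current_rx(void)   { return rx_buf[dma_fill_idx]; }
const float *last_tx(void)          { return ready_idx < 0 ? NULL : tx_buf[ready_idx]; }
bool         user0_is_pending(void) { return user0_pending; }
unsigned     overrun_count(void)    { return overruns; }
```

On a host you would fill dma_current_rx(), call dma_complete_isr() and then user0_isr() by hand. On the target the second call happens by itself once the DMA ISR returns, and the overrun counter is where a blown real-time budget would show up.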
In my experience, this processing is usually very serial... e.g., maybe you start with a low-pass filter on the input, then maybe you do some bass management (more filters), then you finish with some more filtering before outputting to a DAC. You necessarily have to do these in 1-2-3 order.
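As a sketch of what I mean by "serial", assume each stage is a biquad run in place over the block (the Biquad struct and process_chain() are made-up names, and transposed direct form II is just one convenient choice). Stage k+1 reads exactly what stage k wrote, so nothing inside a single block can run side by side.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    float b0, b1, b2, a1, a2;  /* coefficients, with a0 normalized to 1 */
    float z1, z2;              /* filter state, carried across blocks   */
} Biquad;

/* One biquad, transposed direct form II, processed in place. */
static void biquad_process(Biquad *f, float *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float x = buf[i];
        float y = f->b0 * x + f->z1;
        f->z1 = f->b1 * x - f->a1 * y + f->z2;
        f->z2 = f->b2 * x - f->a2 * y;
        buf[i] = y;
    }
}

/* Run the stages strictly in order: each one consumes the previous output. */
void process_chain(Biquad *stages, size_t num_stages, float *buf, size_t n)
{
    assert(stages != NULL && buf != NULL);
    for (size_t s = 0; s < num_stages; s++)
        biquad_process(&stages[s], buf, n);
}
```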
In such an archetypal SHARC system, where there's little parallelism in the signal flow, how can we leverage these accelerators? Does anyone have an alternative architecture/thread-priority scheme to the one I presented above? Just thinking out loud here, but if latency isn't a big concern, perhaps introducing a software/data 'pipeline' may provide an opportunity to use these.... <squint> ... o_0 ... <considers complex state-table> ... <head explodes>
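For what it's worth, here is roughly what I picture for the pipeline idea, as a host-side C model rather than real driver code. The assumptions are all mine: the chain splits into stage A (core), stage B (accelerator) and stage C (core); accel_submit() is a hypothetical stand-in that runs synchronously here but would return immediately on hardware; and the three stage bodies are dummy arithmetic. In each block period the core runs A on block n, hands it to the accelerator, then runs C on the accelerator's result for block n-1. The cost is one extra block of latency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 32

typedef struct {
    float a_out[2][BLOCK_SIZE];  /* stage A results, read by the accelerator */
    float b_out[2][BLOCK_SIZE];  /* stage B results, written by accelerator  */
    int   slot;                  /* slot used for the block entering now     */
    bool  primed;                /* false until one block has passed stage B */
} Pipeline;

/* Dummy stage bodies; real ones would be filters or FFTs. */
static void stage_a(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) out[i] = 0.5f * in[i];
}

/* Hypothetical accelerator job. Simulated synchronously here; on hardware
 * this would only start the job and the result would arrive later. */
static void accel_submit(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) out[i] = in[i] + 1.0f;
}

static void stage_c(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) out[i] = 2.0f * in[i];
}

void pipeline_init(Pipeline *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* One block period. Returns true once 'out' carries real audio
 * (the block that entered one period earlier), false while filling. */
bool pipeline_tick(Pipeline *p, const float *in, float *out)
{
    assert(p != NULL && in != NULL && out != NULL);
    int  cur  = p->slot;
    int  prev = cur ^ 1;
    bool have_output = p->primed;

    stage_a(in, p->a_out[cur], BLOCK_SIZE);                    /* core, block n    */
    accel_submit(p->a_out[cur], p->b_out[cur], BLOCK_SIZE);    /* accel, block n   */

    if (have_output)
        stage_c(p->b_out[prev], out, BLOCK_SIZE);              /* core, block n-1  */
    else
        memset(out, 0, BLOCK_SIZE * sizeof out[0]);            /* silence at start */

    p->slot   = prev;
    p->primed = true;
    return have_output;
}
```

The double-buffered slots are the whole "state table" in this toy version: the accelerator only ever touches slot cur while the core reads slot prev. A real design would still need a completion check before stage C, in case the accelerator has not finished the previous block.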