FIRA Performance
The FIRA processing mainly consists of the following stages:
- TCB load
- Coefficient load
- Delay line preload
- Compute
- Index write back
The FIRA runs at SCLK0 on ADSP-SC58x/2158x and ADSP-SC57x/2157x processors. The FIR compute engine consists of four floating-point MAC units, which together can process up to 4 taps per sample per SCLK cycle. With SCLK = CCLK/4, this is equivalent to an effective performance of 1 core cycle per sample per tap.
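The arithmetic above can be sketched as a small calculation. This is only an illustrative model of the compute stage, not part of any ADI driver or API. The function and constant names are hypothetical, and the default CCLK of 450 MHz is an assumption taken from the measurement setup described in this section.

```python
# Ideal (overhead-free) FIRA compute cost, using the figures quoted above.
# All names here are hypothetical and exist only for illustration.
FIRA_MAC_UNITS = 4   # floating-point MAC units in the FIR compute engine
CCLK_PER_SCLK = 4    # SCLK = CCLK / 4


def fira_ideal_core_cycles(window_size: int, taps: int) -> float:
    """Core cycles for the compute stage alone, ignoring all DMA overheads."""
    # One MAC per tap per sample, with 4 MACs retired per SCLK cycle.
    sclk_cycles = window_size * taps / FIRA_MAC_UNITS
    return sclk_cycles * CCLK_PER_SCLK


def fira_ideal_time_us(window_size: int, taps: int, cclk_hz: float = 450e6) -> float:
    """Ideal compute time in microseconds at the given core clock (assumed 450 MHz)."""
    return fira_ideal_core_cycles(window_size, taps) / cclk_hz * 1e6


if __name__ == "__main__":
    cycles = fira_ideal_core_cycles(256, 512)
    print(f"ideal core cycles: {cycles:.0f}")
    print(f"ideal time: {fira_ideal_time_us(256, 512):.2f} us")
```

Because the factor of 4 MAC units cancels against the CCLK/SCLK ratio of 4, the result is simply window size multiplied by tap count, which is the "1 core cycle per sample per tap" figure.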
In addition, FIRA performance is affected by the one-time DMA overheads needed for stages 1, 2, 3, and 4. These DMA overheads become less significant as the block size and the number of taps increase. ADSP-SC57x processors reduce the DMA overhead through burst mode support.
The graphs below show the FIRA performance measured on the ADSP-SC589 and ADSP-SC573 EZ-Kit boards at CCLK = 450 MHz and SCLK = 112.5 MHz for different window (block size) and tap length values.
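To illustrate why the overheads matter less for larger blocks, the following minimal sketch spreads a per-block overhead across all sample-tap products. The overhead value of 2000 core cycles is a made-up placeholder, not a measured figure, and treating the overhead as a single lump sum per block is a simplifying assumption. The function name is hypothetical.

```python
# Amortization of a one-time per-block overhead over window_size * taps MACs.
# ASSUMPTION: the overhead is a single fixed cost per block; the value used
# below (2000 core cycles) is a placeholder, not a measurement.

def fira_cycles_per_sample_per_tap(window_size: int, taps: int,
                                   overhead_core_cycles: float,
                                   ideal: float = 1.0) -> float:
    """Effective core cycles per sample per tap, including amortized overhead."""
    return ideal + overhead_core_cycles / (window_size * taps)


if __name__ == "__main__":
    overhead = 2000.0  # hypothetical per-block overhead in core cycles
    for window_size, taps in [(32, 16), (128, 64), (1024, 512)]:
        eff = fira_cycles_per_sample_per_tap(window_size, taps, overhead)
        print(f"window={window_size:5d} taps={taps:4d} -> {eff:.4f} cycles/sample/tap")
```

Under this model the effective figure approaches the ideal value of 1 as the product of window size and tap count grows, whatever the actual overhead turns out to be.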
Fig 1.0 - FIRA Performance on SC57x
Fig 2.0 - FIRA Performance on SC58x
The following conclusions can be drawn from the above graphs:
- The measured performance tends towards 1 core cycle per sample per tap as the window size and tap length increase.
- The DMA overheads are lower on the ADSP-SC573 than on the ADSP-SC589 because of burst mode.
IIRA Performance
The IIRA processing mainly consists of the following stages:
- TCB load
- Coefficient load
- Compute
- Index write back
- Save state (optional)
The IIRA runs at SCLK0 on ADSP-SC58x/2158x and ADSP-SC57x/2157x processors. The IIR compute engine consists of a single floating-point MAC unit, which can process up to 1 biquad per 5 SCLK cycles. With SCLK = CCLK/4, this is equivalent to an effective performance of 20 core cycles per sample per biquad stage.
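The same conversion can be written out for the IIRA. This is a sketch of the compute stage only, with hypothetical names. The default CCLK of 450 MHz is an assumption taken from the measurement setup in this section, and DMA overheads are deliberately left out.

```python
# Ideal (overhead-free) IIRA compute cost, using the figures quoted above.
# All names here are hypothetical and exist only for illustration.
IIRA_SCLK_PER_BIQUAD = 5   # SCLK cycles per biquad per sample
CCLK_PER_SCLK = 4          # SCLK = CCLK / 4


def iira_ideal_core_cycles(window_size: int, biquads: int) -> int:
    """Core cycles for the compute stage alone, ignoring all DMA overheads."""
    return window_size * biquads * IIRA_SCLK_PER_BIQUAD * CCLK_PER_SCLK


def iira_ideal_time_us(window_size: int, biquads: int, cclk_hz: float = 450e6) -> float:
    """Ideal compute time in microseconds at the given core clock (assumed 450 MHz)."""
    return iira_ideal_core_cycles(window_size, biquads) / cclk_hz * 1e6


if __name__ == "__main__":
    print(iira_ideal_core_cycles(1, 1))     # one sample through one biquad
    print(iira_ideal_core_cycles(256, 8))   # a 256-sample block through 8 biquads
```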
In addition, IIRA performance is affected by the one-time DMA overheads needed for stages 1, 2, and 3. These DMA overheads become less significant as the block size and the number of biquad stages increase. ADSP-SC57x processors reduce the DMA overhead through burst mode support. Because the IIRA has enough memory to store the coefficients and state variables for all the biquad stages, the coefficient load can optionally be skipped once the coefficients have been loaded, which further improves overall performance.
The graphs below show the IIRA performance measured on the ADSP-SC589 and ADSP-SC573 EZ-Kit boards at CCLK = 450 MHz and SCLK = 112.5 MHz for different window (block size) and biquad stage values.
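The benefit of skipping the coefficient load can be illustrated with a rough cost model averaged over several blocks. The overhead figures used here (500 and 300 core cycles) are invented placeholders rather than measurements, and the split into a "fixed" overhead and a separate coefficient load overhead is an assumption made for this sketch. The function name is hypothetical.

```python
# Average IIRA cost over several blocks, with and without reloading the
# coefficients for every block.
# ASSUMPTION: overhead values are placeholders, not measured figures.
IDEAL_CYCLES_PER_SAMPLE_PER_BIQUAD = 20


def iira_avg_cycles_per_sample_per_biquad(window_size: int, biquads: int,
                                          n_blocks: int,
                                          fixed_overhead: float,
                                          coeff_load_overhead: float,
                                          skip_coeff_reload: bool) -> float:
    """Average core cycles per sample per biquad across n_blocks blocks."""
    work = window_size * biquads
    compute = IDEAL_CYCLES_PER_SAMPLE_PER_BIQUAD * work
    # Coefficients are loaded once if reloads are skipped, else once per block.
    coeff_loads = 1 if skip_coeff_reload else n_blocks
    total = n_blocks * (compute + fixed_overhead) + coeff_loads * coeff_load_overhead
    return total / (n_blocks * work)


if __name__ == "__main__":
    args = dict(window_size=64, biquads=4, n_blocks=100,
                fixed_overhead=500.0, coeff_load_overhead=300.0)
    print(iira_avg_cycles_per_sample_per_biquad(skip_coeff_reload=False, **args))
    print(iira_avg_cycles_per_sample_per_biquad(skip_coeff_reload=True, **args))
```

In this model the saving grows with the number of blocks processed, because the coefficient load is paid once instead of once per block.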
Fig 3.0 - IIRA Performance on SC57x
Fig 4.0 - IIRA Performance on SC58x
The following conclusions can be drawn from the above graphs:
- The measured performance tends towards 20 core cycles per sample per biquad stage as the window size and the number of biquad stages increase.
- The DMA overheads are lower on the ADSP-SC573 than on the ADSP-SC589 because of burst mode.