How to use SIMD to speed up an FIR filter with circular buffering

Hello DSPers,


I can't figure out how to use SIMD on a 21488 to speed up an FIR filter computation.  Here's why:  the data being processed is written into a circular buffer, so every other new sample lands on an ODD address.  How can I process this data in SIMD pairs when SIMD can only access pairs of addresses aligned to an even address boundary?  It seems like I need to swap the data pairs around on every odd pass through the circular buffer or something, but I'm afraid that might cancel the speed benefit of SIMD!
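To make the problem concrete, here is a rough plain-C model of the delay line (not SHARC code, and not a solution). The names NTAPS, delay, head, push_sample, fir_scalar and newest_is_odd are made up for illustration, and the tap count of 8 is just an assumption. It only shows that the index of the newest sample alternates between even and odd, which is the case a pairwise fetch starting at the newest sample would have to cope with.

```c
#define NTAPS 8u  /* hypothetical tap count; assumed even */

static float delay[NTAPS];  /* circular delay line, zero-initialized */
static unsigned head = 0;   /* index of the most recent sample */

/* Store one new sample. head advances by one each call,
   so its parity flips on every new sample. */
static void push_sample(float x)
{
    head = (head + 1u) % NTAPS;
    delay[head] = x;
}

/* Scalar reference FIR: y[n] = sum over k of h[k] * x[n-k],
   walking backwards from the newest sample with wraparound. */
static float fir_scalar(const float *h)
{
    float acc = 0.0f;
    for (unsigned k = 0; k < NTAPS; ++k)
        acc += h[k] * delay[(head + NTAPS - k) % NTAPS];
    return acc;
}

/* Returns 1 when the newest sample sits on an odd index, i.e. when a
   two-word fetch starting there would not be even-aligned. */
static int newest_is_odd(void)
{
    return (int)(head & 1u);
}
```

Calling push_sample() once per input and checking newest_is_odd() after each call shows the result toggling between 1 and 0, while fir_scalar() stays correct because it reads one word at a time.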


If anyone has any good ideas here, please let me know!