I want to perform FFTs with the FFT accelerator on the 21469, using block data that I receive from a SPORT in multichannel configuration. This means the data block is channel-interleaved. I understand from https://ez.analog.com/message/24289#24289 that the FFT input DMA TCB can be configured as a circular buffer with the modify value set to the number of channels, so that the DMA de-interleaves the data as it writes it into the FFT accelerator's dedicated memory. This makes perfect sense at a high level, but the details are tricky and the documentation is sparse, and I have not been able to get it working.
I only need 256-sample FFTs, but I thought the expert code generator's 512-point FFT example might help, since it shows how to use circular-buffer DMA to perform FFTs down the vertical columns of a butterfly FFT scheme. It doesn't make sense to me, though. Here are the circular buffer TCBs from the 512 example:
/*TCB for loading 2*N-1 Input data points coefficients (modifier = 2*H) */
FFT_VIP_TCB[0]= 0;
FFT_VIP_TCB[1]= (int)FFT_IP_buff; //Circular Buffer Base
FFT_VIP_TCB[2]= 1023; //Circular Buffer Length
FFT_VIP_TCB[3]= 1023; //Count
FFT_VIP_TCB[4]= 64; //Modify
FFT_VIP_TCB[5]= (int)FFT_IP_buff; //Index
/*TCB for loading 2*N-1 Input data points coefficients (modifier = 2*H) */
FFT_VLASTIP_TCB[0]= 0;
FFT_VLASTIP_TCB[1]= (int)FFT_IP_buff;
FFT_VLASTIP_TCB[2]= 1024;
FFT_VLASTIP_TCB[3]= 1;
FFT_VLASTIP_TCB[4]= 1;
FFT_VLASTIP_TCB[5]= (int)FFT_IP_buff+1023;
I understand that the circular buffer length has to be less than the data block size, so that each wrap-around shifts the index to the next column/channel once all of the samples in the previous column/channel have been read. I also understand that this is why the circular buffer TCB must be followed by a second TCB that reads only the very last sample of the last column/channel. What confuses me is that the 512-point FFT operates on packed complex samples (each real value is immediately followed by its imaginary value). Because the data is complex, I would expect the circular buffer length to be 2 less than the block size, not 1 less as in the example. Alternatively, it seems to me that the DMA needs to transfer 2 values at a time, in which case the TCB length and count should be N-1, not 2N-1 (or the 2N-2 I suggested above). Can anybody help?
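To make the question concrete, here is a minimal host-side sketch in plain C (not SHARC code) of the word order these two TCBs would produce. It assumes the DMA advances the index by the modify value after every word and subtracts the buffer length once the index reaches base + length. That wrap rule, and the names tcb_model, tcb_offsets and example_512_offsets, are my own assumptions for illustration and are not taken from the hardware reference or the generated example.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model of one circular-buffer DMA TCB (hypothetical type).
   Field order follows the TCB array indices used in the example. */
typedef struct {
    int base;    /* circular buffer base   (TCB[1]) */
    int length;  /* circular buffer length (TCB[2]) */
    int count;   /* words to transfer      (TCB[3]) */
    int modify;  /* index step per word    (TCB[4]) */
    int index;   /* starting index         (TCB[5]) */
} tcb_model;

/* Write the word offsets (relative to base) that the modelled TCB visits
   into out[] and return how many were written.
   ASSUMPTION: the index wraps by subtracting length once it reaches
   base + length. */
size_t tcb_offsets(const tcb_model *t, int *out, size_t out_cap)
{
    assert(t->length > 0 && t->modify > 0 && t->modify <= t->length);
    int idx = t->index;
    size_t n = 0;
    for (int i = 0; i < t->count && n < out_cap; ++i) {
        out[n++] = idx - t->base;
        idx += t->modify;
        if (idx >= t->base + t->length)
            idx -= t->length;
    }
    return n;
}

/* Offsets for the two TCBs of the 512-point example, back to back,
   with the buffer base taken as offset 0. */
size_t example_512_offsets(int *out, size_t out_cap)
{
    tcb_model vip     = { 0, 1023, 1023, 64, 0 };    /* FFT_VIP_TCB     */
    tcb_model vlastip = { 0, 1024, 1,    1,  1023 }; /* FFT_VLASTIP_TCB */
    size_t n = tcb_offsets(&vip, out, out_cap);
    n += tcb_offsets(&vlastip, out + n, out_cap - n);
    return n;
}
```

Under those assumptions, calling example_512_offsets with a 1024-entry array gives offsets 0, 64, 128, ..., 960, then 1, 65, and so on, and every one of the 1024 words is read exactly once. However, the real word at offset 0 and its imaginary partner at offset 1 end up 16 words apart in the stream, which is exactly what I cannot reconcile with packed complex input.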
I'm beginning to think that the only way to de-interleave the data via the FFT DMA is to use the unpacked format. That seems to resolve all of my issues with the TCB setup, and I will try it tomorrow. It seems to me that even though packed data is required for FFTs with N > 256, the DMA actually unpacks the data as it writes it into the FFT's dedicated memory. I may have just answered my own question.
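As a sanity check on the unpacked idea, here is a minimal host-side sketch in plain C (again not SHARC code) of the read order I would expect for real-only, channel-interleaved data. It assumes a circular-buffer TCB with modify equal to the number of channels and length and count one less than the block size, followed by a one-word TCB for the final sample, with the index wrapping by a single subtraction of the length. The function name deinterleave_like_dma and the wrap rule are my assumptions, not documented behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a channel-interleaved block into channel-contiguous order, visiting
   the input in the order the modelled TCB pair would read it.
   ASSUMPTION: circular length = total - 1, modify = channels, and the index
   wraps by subtracting the length once it reaches it. */
void deinterleave_like_dma(const float *in, float *out,
                           int channels, int samples_per_channel)
{
    assert(channels >= 2 && samples_per_channel >= 2);
    const int total  = channels * samples_per_channel;
    const int length = total - 1;   /* circular buffer length and count */
    int idx = 0;

    /* First (circular) TCB: all words except the very last one. */
    for (int i = 0; i < length; ++i) {
        out[i] = in[idx];
        idx += channels;            /* modify = number of channels */
        if (idx >= length)
            idx -= length;          /* wrap lands on the next channel */
    }

    /* Second TCB: the last sample of the last channel. */
    out[total - 1] = in[total - 1];
}
```

With 4 channels and 256 samples per channel, this model reads offsets 0, 4, ..., 1020, wraps to 1, 5, ..., 1021, and so on, so the output holds all of channel 0, then channel 1, and so on, which is the layout I want in the accelerator's memory if the hardware behaves the same way.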
I am also curious about the rules for chaining TCBs together to queue up multiple FFTs with different DMA configurations. I have had success batching several FFTs using the FFT_RPT bit and a single TCB that reads multiple data sets (with a similar TCB on the output). I have not had much luck chaining multiple TCBs together to define, for example, different input and output buffer locations. I assume that this should be possible. Is there a trick I don't know about?