When doing a memory-to-memory DMA transfer, programming the DMA_BWLCNT register to zero, which is supposed to maximize the rate of memory-to-memory bus transfer requests, actually gives the slowest possible transfer rate.
It looks like the DMA decrements the count first and then checks for zero, so a count that starts at zero would have to wrap all the way around before it reaches zero again.
So the bandwidth count needs to be set to 1 in order to get the fastest memory DMA transfer speed. Here is what the current documentation says:
“The DMA_BWLCNT register contains a count that determines how often the DMA issues memory transactions. The
DMA loads the value from DMA_BWLCNT register into DMA_BWLCNT_CUR and decrements the current value each
SCLK cycle. When DMA_BWLCNT_CUR reaches 0x0000, the next request is issued, and the DMA reloads
DMA_BWLCNT_CUR. This bandwidth limit functionality is not applied to descriptor fetch requests. Programming
0x0000 allows the DMA to request as often as possible. 0xFFFF is a special case and causes requests to stop”.
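
To illustrate the suspected behaviour, here is a small C model of a counter that decrements first and only then compares against zero. This is just a sketch of my guess, not something taken from the hardware documentation: the function name cycles_between_requests is made up, the counter is assumed to be 16 bits wide and to wrap from 0x0000 to 0xFFFF, and the documented 0xFFFF "stop" case is not modelled.

```c
#include <stdint.h>

/* Hypothetical model, not from the hardware reference.
 * Returns how many SCLK cycles would pass between two requests if the
 * DMA decrements DMA_BWLCNT_CUR first and only then checks for zero.
 * Assumes a 16-bit counter that wraps from 0x0000 to 0xFFFF.
 * The special 0xFFFF "stop" value is not modelled here. */
static uint32_t cycles_between_requests(uint16_t bwlcnt)
{
    uint16_t cur = bwlcnt;   /* reload DMA_BWLCNT_CUR from DMA_BWLCNT */
    uint32_t cycles = 0;

    do {
        cur = (uint16_t)(cur - 1u);  /* decrement first */
        cycles++;
    } while (cur != 0);              /* then check for zero */

    return cycles;
}
```

In this model, cycles_between_requests(0x0000) returns 65536 while cycles_between_requests(0x0001) returns 1, which would match what I am seeing: zero behaves as the slowest setting and 1 as the fastest.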