
MDMA between L2 and GPIO port


I'm trying to use a GPIO port of the BF707 to read in a stream of data from a fast external ADC. To see what sample rates are possible, I used MDMA0 in the opposite direction and sent the toggling contents of a testBuffer to the GPIO port data register. The BF707 is clocked as fast as possible (PLL, cclk, sclk0 = 800, 400, 100 MHz).

        for(i=0; i<100; i++) testBuffer[i] = (i&1)? 0:ADI_GPIO_PIN_8;
        *pREG_PORTA_DIR_SET   = ADI_GPIO_PIN_8;    // IOA8 output
        *pREG_DMA17_ADDRSTART = (void *) REG_PORTA_DATA;    // start address of MDMA destination
        *pREG_DMA16_ADDRSTART = testBuffer;        // start address of MDMA source
        *pREG_DMA16_XCNT = 100;        // 100 words
        *pREG_DMA17_XCNT = 100;
        *pREG_DMA17_XMOD = 0;        // increment = 0 (PortA data register MMR)
        *pREG_DMA16_XMOD = 2;        // increment = 2 (SRAM)
        *pREG_DMA16_BWLCNT = 0;        // set minimal bandwidth limit
        *pREG_DMA17_BWLCNT = 0;
        *pREG_DMA16_CFG  =
            ENUM_DMA_CFG_READ |        // read from memory
            ENUM_DMA_CFG_STOP |        // stop mode
            ENUM_DMA_CFG_MSIZE02 |    // 2*8=16bit width memory bus
            ENUM_DMA_CFG_PSIZE02 |    // 2*8=16bit width peripheral bus (=connection to MDMA0_DST)
            ENUM_DMA_CFG_SYNC;        // clear fifo at start
        *pREG_DMA17_CFG  =
            ENUM_DMA_CFG_WRITE |    // write to Port data register
            ENUM_DMA_CFG_STOP |        // stop mode
            ENUM_DMA_CFG_TRGWAIT |    // wait for trigger
            ENUM_DMA_CFG_MSIZE02 |    // 2*8=16bit width memory bus
            ENUM_DMA_CFG_PSIZE02 |    // 2*8=16bit width peripheral bus (=connection to MDMA0_SRC)
            ENUM_DMA_CFG_SYNC;        // clear fifo at start
        *pREG_DMA16_CFG |= ENUM_DMA_CFG_EN;    // enable the DMA
        *pREG_DMA17_CFG |= ENUM_DMA_CFG_EN;    // enable the DMA

        *pREG_TRU0_GCTL  = BITM_TRU_GCTL_EN;        // enable trigger propagating
//        *pREG_TRU0_SSR38 = TRGM_SOFT0_MST;            // slave 38 is MDMA0_SRC, assign it to master
        *pREG_TRU0_SSR39 = TRGM_SOFT0_MST;            // slave 39 is MDMA0_DST, assign it to master
        *pREG_TRU0_MTR   = BITM_TRU_MTR_MTR0 & TRGM_SOFT0_MST;    // software trigger for master0

        for(i=0; i<1000; i++);        // let the DMAs happen

The MDMA works as expected. It generates 50 square waves, but VERY slowly:

In the other direction, when feeding a port input line with a 2 MHz square-wave test signal, I see 6 MDMA transfers happen during one period. That translates to an MDMA transfer rate of 12 MHz.
Does anybody know why? There shouldn't be anything that slows down the MDMA: I'm just running a simple empty loop while waiting for completion, and there are no obvious interrupts. The generated and received bit patterns are steady.
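
For reference, the arithmetic behind the 12 MHz figure can be written out as a small sketch. The function names are hypothetical helpers and not part of any ADI header. The 100 MHz sclk0 value is taken from the clock settings above; whether sclk0 is actually the clock that limits the transfer is an assumption.

```c
#include <stdint.h>

/* Transfers per second, given how many transfers fit into one period
 * of a reference signal with a known frequency. */
static uint32_t transfer_rate_hz(uint32_t signal_hz, uint32_t transfers_per_period)
{
    return signal_hz * transfers_per_period;
}

/* System clock cycles spent per transfer, scaled by 100 so that two
 * decimal places survive without floating point (833 means 8.33). */
static uint32_t cycles_per_transfer_x100(uint32_t sclk_hz, uint32_t rate_hz)
{
    return (uint32_t)(((uint64_t)sclk_hz * 100u) / rate_hz);
}
```

With the numbers from this post, `transfer_rate_hz(2000000u, 6u)` gives 12000000, and `cycles_per_transfer_x100(100000000u, 12000000u)` gives 833, i.e. roughly 8.3 sclk0 cycles per port read.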

  • Here is the missing picture of the generated square wave:

  • In the datasheet of the BF70x I found the requirement that the minimum input pulse width on a GPIO port is 2 sclk0 cycles = 20 ns. That results in a maximum input frequency of 25 MHz. I guess that in reality, in the output direction, the GPIO requires 6 sclk0 cycles to update the port.

    It is not the MDMA that is slow, but the GPIO itself. When I send a bit pattern directly to the port data register without using the MDMA, I see the same 8.33 MHz square wave again.

    Therefore the issue is NOT with the MDMA, and the thread should be renamed to:

    Fastest possible GPIO output frequency is only 8.333 MHz.
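
    The numbers above can be checked with a short sketch. It assumes sclk0 = 100 MHz (from the original post) and that a square wave needs two port updates per period (one high, one low). The helper names are hypothetical, and the 6-cycle update cost is the guess stated above, not a datasheet value.

```c
#include <stdint.h>

/* Assumed system clock from the original post: sclk0 = 100 MHz. */
#define SCLK0_HZ 100000000u

/* Port updates per second when each update costs `cycles` sclk0 cycles. */
static uint32_t update_rate_hz(uint32_t sclk_hz, uint32_t cycles)
{
    return sclk_hz / cycles;
}

/* A square wave needs two updates (high and low) per period, so its
 * frequency is half the update rate. */
static uint32_t square_wave_hz(uint32_t sclk_hz, uint32_t cycles_per_update)
{
    return update_rate_hz(sclk_hz, cycles_per_update) / 2u;
}
```

    `square_wave_hz(SCLK0_HZ, 2u)` gives 25000000, which matches the 25 MHz input limit derived from the 2-cycle minimum pulse width, and `square_wave_hz(SCLK0_HZ, 6u)` gives 8333333, which matches the observed 8.33 MHz output.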

  • Hello,

    As mentioned, the processor has a hierarchical memory structure comprising multiple memory spaces with different access speeds. The core is able to access data in L1 memory in one core cycle and data in L2 memory in multiple core cycles, with longer access times for L3 memory. If data are in L2/L3 memory, the processor has to stall for multiple core cycles until the data is ready. Using cache is one way to make an application execute efficiently, but if too many cache misses occur, the application may not achieve the desired performance. To solve this, Direct Memory Access (DMA) can be used to move data between L2/L3 memory and L1 memory without core intervention. Additionally, just as core buses can simultaneously access different banks of memory, the same holds true for the DMA channels. Although the DMA engines use a dedicated bus for transfers, they still compete with the core for access to a targeted bank of memory. Therefore, choosing different memory banks for receiving and sending data via DMA, to remove conflicts with the core, can improve the data throughput of an application.

    Please note that the optimize_for_speed pragma is available to instruct the compiler to optimize functions for maximum speed.

    Please refer to the EE-394 application note below for more information.

    Note: DO NOT forget to add #pragma optimize_as_cmd_line after the last line of the functions that need speed optimization; otherwise, subsequent functions that are supposed to be optimized for size will be optimized for speed.
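
    As a minimal sketch of the pragma placement described above: the function `count_high_samples` and its parameters are hypothetical and only stand in for a routine that needs speed optimization. The two pragmas are the ones named in this reply; compilers that do not know them will normally just ignore them.

```c
#include <stdint.h>

#pragma optimize_for_speed      /* functions from here on are optimized for speed */

/* Hypothetical hot routine: counts samples in which the masked pin is high. */
uint32_t count_high_samples(const uint16_t *samples, uint32_t n, uint16_t pin_mask)
{
    uint32_t count = 0u;
    for (uint32_t i = 0u; i < n; i++) {
        if (samples[i] & pin_mask) {
            count++;
        }
    }
    return count;
}

#pragma optimize_as_cmd_line    /* restore the command-line optimization settings */
```

    Functions defined after the second pragma are again compiled with the optimization options given on the command line.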

    However, could you please confirm the points below so that we can assist you better?
    1) Could you elaborate on what you are really trying to identify with this?
    2) Could you describe your system in more detail?
    3) Is your ADC parallel or serial interfaced?
    4) Are you aware of the high-speed interfaces available on the processor for communicating with your ADC (such as SPI, SPORT, PPI)?
    Please have a look at Table 18, Peripheral Clock Operating Conditions (page 51 of 114). Please find the datasheet at the link below.
    Normally, the MDMA runs at the SCLK1 frequency on the ADSP-BF70x processor.