
Issues with UART Transmission of Normalized Data from ADSP-SC594 SPORT Channels

Thread Summary

The user encountered an issue on the ADSP-SC594 where channels 3 and 4 of the normalized data in shared L3 memory stopped updating after about 16 elements when all four channels were normalized in one combined loop; separate per-channel loops worked. Flushing the data buffers with flush_data_buffer() resolved the symptom, which points to a cache coherence problem. The recommended fix was to explicitly mark the shared memory region as non-cacheable on the SHARC+ core using the adi_cache_set_range() API, because the region had only been configured as uncached on the ARM core.
AI Generated Content
Category: Software
Product Number: ADSP-SC594
Software Version: 3.0.3

Hi,

I'm working on an ADSP-SC594 project, using Core1 for signal acquisition and processing, and Core0 for UART data transmission. I'm acquiring data via four SPORT channels with group enable, and the acquisition works correctly.

Signal Processing on Core1

In the SPORT callback, I normalize the acquired data (rxBufferCh1, rxBufferCh2, rxBufferCh3, rxBufferCh4) using the following code:

for (int i = 0; i < DMA_BUF_SIZE; i++) {
    rxBufferCh1Nor[i] = rxBufferCh1[i] * step_voltage;
    rxBufferCh2Nor[i] = rxBufferCh2[i] * step_voltage;
    rxBufferCh3Nor[i] = rxBufferCh3[i] * step_voltage;
    rxBufferCh4Nor[i] = rxBufferCh4[i] * step_voltage;
}

The normalized arrays (rxBufferChxNor) are of type float and stored in shared L3 memory for access by Core0.

Issue

Core0 successfully transmits data from rxBufferCh1Nor and rxBufferCh2Nor via UART, but channels 3 and 4 (rxBufferCh3Nor, rxBufferCh4Nor) show incorrect data. Specifically, after approximately 15 or 16 elements, the normalized arrays for channels 3 and 4 stop updating and retain their initial values (e.g., zeros from the first acquisition), causing waveform distortion in the transmitted data.

Workaround

I found that using separate loops for each channel resolves the issue:

for (int i = 0; i < DMA_BUF_SIZE; i++) {
    rxBufferCh1Nor[i] = rxBufferCh1[i] * step_voltage;
}
for (int i = 0; i < DMA_BUF_SIZE; i++) {
    rxBufferCh2Nor[i] = rxBufferCh2[i] * step_voltage;
}
for (int i = 0; i < DMA_BUF_SIZE; i++) {
    rxBufferCh3Nor[i] = rxBufferCh3[i] * step_voltage;
}
for (int i = 0; i < DMA_BUF_SIZE; i++) {
    rxBufferCh4Nor[i] = rxBufferCh4[i] * step_voltage;
}

However, I don't understand why the combined loop causes issues for channels 3 and 4.

Details

  • Hardware: ADSP-SC594

  • Cores: Core1 handles acquisition and normalization; Core0 handles UART transmission

  • Memory: All normalized buffers are in shared L3 memory

  • SPORT Config: Four channels with group enable, acquisition verified correct

  • DMA_BUF_SIZE: 1024

  • UART: Configured for Core0, works for channels 1 and 2

  • step_voltage: Consistent across all channels

What I've Tried

  • Verified SPORT acquisition data in rxBufferChx (correct for all channels)

  • Checked step_voltage value (consistent and valid)

  • Inspected rxBufferChxNor contents; channels 3 and 4 stop updating every 16 elements

  • Confirmed UART configuration is correct (works for channels 1 and 2)

  • Tested separate loops for normalization, which resolves the issue

Questions

  1. Why does the combined loop fail to update rxBufferCh3Nor and rxBufferCh4Nor every 16 elements?

  2. Could this be related to memory alignment, cache coherence, or compiler optimization issues?

  3. Are there known issues with accessing shared L3 memory for multiple channels in ADSP-SC594?

  4. What debugging tools or techniques (e.g., CCES debugger) can help identify the root cause?

  5. Is there a more efficient way to normalize all channels without separate loops?

Any insights or suggestions to resolve this issue or explain the behavior would be greatly appreciated!

Best Regards!

  • Hi,

    In general, when the processor's data caches are enabled in write-back mode, modified data is held in the cache and is not written back to memory immediately, which saves the cost of an external memory access. In this mode, it may be necessary to ensure that any modified data has been flushed to memory so that external systems can access it. DMA transfers and dual-core accesses are common cases where write-back data needs to be flushed.

    If your results buffer resides in cacheable memory, then after modifying the buffer it should be flushed to memory using the “flush_data_buffer” API in the ADI_FLUSH_DATA_NOINV mode. This ensures that the DMA transfer or dual-core access does not use stale data.
    Please refer to the example below:
    ez.analog.com/.../example-code-to-maintain-data-coherency-between-core-and-dma-using-flush_5f00_data_5f00_buffer-api
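
The ordering described above (write the results, then flush the written range before the other core reads it) can be sketched in plain C as follows. This is only an illustration under stated assumptions: `normalize_and_flush`, `flush_hook_t`, and `host_flush_stub` are hypothetical names that do not come from CCES, the raw sample type is assumed to be `int`, and the hook stands in for a thin wrapper around flush_data_buffer() with ADI_FLUSH_DATA_NOINV on the target. The stub only records its arguments so the logic can be checked on a host machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook type. On the target, the hook would be a small wrapper
 * that calls flush_data_buffer(start, end, ADI_FLUSH_DATA_NOINV). */
typedef void (*flush_hook_t)(void *start, void *end);

/* Host-only stand-in for the real flush: it just records what it was given. */
static void *last_start;
static void *last_end;
static int flush_calls;

static void host_flush_stub(void *start, void *end)
{
    last_start = start;
    last_end = end;
    flush_calls++;
}

/* Scale one channel of raw samples into a float buffer, then hand the
 * written range to the flush hook so that a reader on another core does
 * not see stale data. The raw sample type (int) is an assumption. */
static void normalize_and_flush(const int *raw, float *out, size_t count,
                                float step_voltage, flush_hook_t flush)
{
    assert(raw != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        out[i] = (float)raw[i] * step_voltage;
    }

    if (flush != NULL && count > 0) {
        /* End pointer is one past the last element, mirroring the calls
         * shown in this thread; check the API documentation for whether
         * the end address is treated as inclusive. */
        flush(out, out + count);
    }
}
```

Calling the helper once per channel keeps the normalization and the flush together, so a buffer cannot be written without also being flushed. If the buffers are later placed in uncached memory, passing NULL as the hook skips the flush entirely.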

    Also, some consideration is required when allocating memory for the buffer. Please refer to the CCES help section below for more information:
    CrossCore® Embedded Studio 3.0.1 > System Run-Time Documentation > Device Drivers User Guide > Low-Level Driver API Reference > Buffer Alignment and Cache Considerations
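
One observation that may be relevant to the 16-element pattern: with 4-byte floats, 16 elements occupy 64 bytes. If the data cache line on the SHARC+ core is 64 bytes (an assumption here, to be confirmed against the hardware reference), the symptom would be consistent with whole cache lines not reaching L3. The sketch below shows the arithmetic and two small checks that can help when sizing and placing shared buffers. The macro and helper names are hypothetical and are not part of any ADI API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 64-byte data cache line. Verify the real value in the
 * processor hardware reference before relying on it. */
#define CACHE_LINE_BYTES 64u

/* Number of float elements that share one cache line. */
static size_t floats_per_line(void)
{
    return CACHE_LINE_BYTES / sizeof(float);
}

/* Round a byte count up to the next multiple of the cache line size, so
 * that a buffer does not share its last line with unrelated data. */
static size_t round_up_to_line(size_t bytes)
{
    return (bytes + CACHE_LINE_BYTES - 1u) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}

/* Returns nonzero when the address starts on a cache line boundary. */
static int is_line_aligned(const void *p)
{
    assert(p != NULL);
    return ((uintptr_t)p % CACHE_LINE_BYTES) == 0u;
}
```

For the buffers in this thread, 1024 floats are 4096 bytes, which is already a whole number of 64-byte lines under the assumption above; what remains to check is whether each buffer starts on a line boundary.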

    We do not have a specific mechanism to identify the root cause of this issue. However, you can refer to the CCES Help documentation for general debugging features available in CrossCore Embedded Studio (CCES).
    CrossCore® Embedded Studio 3.0.1 > Integrated Development Environment > Debugging Executable Files > About Debugging Capabilities

    To assist you more effectively, could you please provide the following information:
    1. If Cache is enabled, please disable it, try using combined loops, and share the results with us.
    2. Is optimization enabled in your project?

    Regards,
    Nandini C

  • Hi,

    Thanks for the answer.

    I am defining the section of shared memory as uncached in the apt.c file from the core0:

    #define SHARC_L3 ADI_MMU_RO_UNCACHED
    
        /* Dynamic Memory Controller 0 (DMC0) 1GB SDRAM */
        { 0x80000000u, 0x9FFFFFFFu, SHARC_L3                    }, /* 512MB DDR-A */

    Despite the current configuration, I tried to flush the data buffers:

    flush_data_buffer(rxBufferCh1Nor, rxBufferCh1Nor+DMA_BUF_SIZE, ADI_FLUSH_DATA_NOINV);
    flush_data_buffer(rxBufferCh2Nor, rxBufferCh2Nor+DMA_BUF_SIZE, ADI_FLUSH_DATA_NOINV);
    flush_data_buffer(rxBufferCh3Nor, rxBufferCh3Nor+DMA_BUF_SIZE, ADI_FLUSH_DATA_NOINV);
    flush_data_buffer(rxBufferCh4Nor, rxBufferCh4Nor+DMA_BUF_SIZE, ADI_FLUSH_DATA_NOINV);
    

    And it appears to work now. Am I missing something in the memory file configuration?

    Best Regards!

  • Hi,

    You have correctly configured the shared memory region as uncached on the ARM core. However, on the SHARC+ core, you need to explicitly mark the same region as non-cacheable. This can be done using the adi_cache_set_range() API, which allows you to disable caching for a specific memory range at runtime.

    For more details, please refer to the CCES help documentation:
    CrossCore® Embedded Studio 3.0.1 > System Run-Time Documentation > Cache support > SHARC+ Caching Configuration and Support Functions > adi_cache_set_range

    The flush_data_buffer() function is only necessary when working with cached memory. If the buffer is placed in an uncached region, cache flushing is not required, which avoids the additional cycles associated with flush_data_buffer().

    Regards,
    Nandini C
