Cycle Count Difference in SHARC-FX When Function Definition and Call Are in Separate Files

Category: Hardware
Product Number: ADSP-21835

Dear Team,

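I am working with the 21834 processor (SHARC-FX). In simulation, I am observing a difference in cycle counts when running the same function in two scenarios: with the function definition and the function call in the same file, and with the definition and the call in separate files.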

Are there any specific compiler settings, optimization behaviors, or inlining considerations in SHARC-FX that could cause such differences?

Please share the root cause of this behavior.

Thanks and regards,

Franky45

Parents Reply
  • Hi,

    Apologies for the delay in response.

    Regarding Questions 1 and 3 -> We are checking these with our internal team and will get back to you as soon as we hear from them.

    2 -> The cycle count call takes longer on SHARC-FX mainly because of differences in processor architecture: the original SHARC-compatible implementation was extended to support SHARC-FX. However, SHARC-FX also supports reading the cycle count directly through an XT_RSR_CCOUNT() call, which reads the register using an assembly instruction and is more optimised.

    The overhead of the cycle count macro is much larger than it needs to be because the macro was designed to mimic the setup process of the SHARC implementation and stay similar to it, as part of the migration path for SHARC+ projects.

    Regards,
    Santhakumari.V
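
The direct read mentioned in the reply above can be sketched as follows. This is a minimal illustration, not the vendor's reference code: the header `<xtensa/tie/xt_timer.h>` and the `__XTENSA__` predefined macro are assumptions about the SHARC-FX toolchain, and `READ_CYCLES`, `ccount_overhead`, `sum_to`, and `measure_sum_to` are hypothetical names introduced only for this example. Off target, a fake counter is substituted so the sketch still compiles.

```c
#include <stdint.h>

#if defined(__XTENSA__)
/* Assumption: this header provides XT_RSR_CCOUNT() in your toolchain. */
#include <xtensa/tie/xt_timer.h>
#define READ_CYCLES() ((uint32_t)XT_RSR_CCOUNT())
#else
/* Host fallback: a fake counter that advances by 7 per read.
 * It only keeps the sketch compilable; it is NOT a real cycle count. */
static uint32_t fake_ccount;
#define READ_CYCLES() (fake_ccount += 7u)
#endif

/* Cost of two back-to-back counter reads, subtracted from measurements. */
static uint32_t ccount_overhead(void)
{
    uint32_t start = READ_CYCLES();
    uint32_t end = READ_CYCLES();
    return end - start; /* unsigned subtraction tolerates one wrap-around */
}

/* Hypothetical workload standing in for the function under test. */
static int32_t sum_to(int32_t n)
{
    int32_t acc = 0;
    for (int32_t i = 1; i <= n; ++i) {
        acc += i;
    }
    return acc;
}

/* Returns the counter delta around sum_to(n), net of read overhead.
 * The result is written through *out so the call is not discarded. */
static uint32_t measure_sum_to(int32_t n, int32_t *out)
{
    uint32_t overhead = ccount_overhead();
    uint32_t start = READ_CYCLES();
    *out = sum_to(n);
    uint32_t end = READ_CYCLES();
    uint32_t elapsed = end - start;
    return (elapsed > overhead) ? (elapsed - overhead) : 0u;
}
```

Subtracting the back-to-back read cost keeps the measurement overhead out of the reported figure, which matters when the function under test is short. On target, an optimising compiler may still move work across the counter reads, so it is worth checking the disassembly before trusting small deltas.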

Children
  • Hi,

    Please find below a holding response from our internal team.

    We suspect that MMR writes on EHP stall until the register returns an acknowledgement while SHARC+ writes are fire-and-forget. We also found that a read of the same register takes 43 cycles.

    Could you please let us know how important it is that the write is fast? Does the time matter for you? It takes a little longer to configure the DMA, but surely that isn't done often.

    The internal team is working on it now, and we will reply once we hear back from them.

    Thanks for your understanding.

    Regards,
    Santhakumari.V

  • Hi Santhakumari,

    1) Regarding MMR writes on EHP stalling until acknowledgement versus SHARC+ fire-and-forget behavior:

    Why do MMR writes on EHP stall until the register returns an acknowledgement, whereas SHARC+ writes are fire-and-forget and do not wait for one? Specifically, what mechanism allows SHARC+ to proceed without waiting, and why is the acknowledgement mandatory on SHARC-FX? Where can we find details about this architectural difference? Are there specific documents we should refer to?

    2) Regarding the reference to DMA configuration:

    Our earlier question was not specific to DMA configuration. The main concern we raised in the previous discussion was about higher cycle counts observed on SHARC-FX when the function call and function definition are placed in different files, compared to when both are in the same file.

    The mention of DMA configuration seems unrelated to this particular observation.

    Our intent is to understand why, on SHARC-FX, the cycle count increases when the function call and the function definition are placed in different files, compared to when both are in the same file.

    Clarification on this aspect would help us correctly attribute the root cause of the increased cycle counts we are measuring.

    Looking forward to your insights.

    Regards,

    Franky45
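
The thread does not confirm a root cause for the same-file versus separate-file difference. One general property of C toolchains is relevant to it, though: when the definition is visible in the caller's translation unit the compiler may inline it, whereas a definition in another file requires a real call unless link-time or interprocedural optimisation is enabled. The sketch below is a way to test that hypothesis in a single file, under the assumption that the SHARC-FX compiler accepts the GCC/Clang-style `__attribute__((noinline))`. All function names are hypothetical.

```c
#include <stdint.h>

/* Definition visible to the caller: the compiler is free to inline it
 * at higher optimisation levels, removing call and return overhead. */
static inline int32_t scale_visible(int32_t x)
{
    return x * 3 + 1;
}

/* noinline forces a real call, approximating a definition that lives
 * in another .c file with no link-time optimisation.
 * Assumption: the toolchain supports this GCC/Clang-style attribute. */
__attribute__((noinline)) int32_t scale_called(int32_t x)
{
    return x * 3 + 1;
}

/* Loop that calls the inlinable version. */
int32_t run_visible(const int32_t *in, int32_t n)
{
    int32_t acc = 0;
    for (int32_t i = 0; i < n; ++i) {
        acc += scale_visible(in[i]);
    }
    return acc;
}

/* Same loop, but every iteration pays for an actual function call. */
int32_t run_called(const int32_t *in, int32_t n)
{
    int32_t acc = 0;
    for (int32_t i = 0; i < n; ++i) {
        acc += scale_called(in[i]);
    }
    return acc;
}
```

Both loops compute the same result, so any cycle difference between `run_visible` and `run_called` comes from the call itself. If that gap is similar to the gap seen between the same-file and separate-file builds, inlining is a likely contributor; if it is not, the cause lies elsewhere.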