
ADSP-SC58x: how to debug a fatal error in the ARM core?

Hi,

My project has a very rare crash condition that takes down the ARM core. Using the ICE debugger, I can see that the ARM core ends up in the __fatal_error ASM routine, while the DSP cores keep running.

I need to figure out how this happened, but there is no stack trace in the ARM if I stop after the crash. I have been reading the CCES 2.9.0 C/C++ Library Manual for SHARC Processors and I see that there are several global variables that could help me out: _adi_fatal_error_general_code, _adi_fatal_error_specific_code, and so on.

Unfortunately, these symbols do not seem to be present in the ARM symbol map, only in the DSP one (I can see their addresses in the DSP .map.xml files).

Therefore, two questions:

- how can I debug a fatal error in the ARM?

- shouldn't you clarify in the C/C++ Library Manual that the _adi_fatal_error_* global variables are available only on the DSP? If this is an error in the manual, that is very disappointing.

Best regards

  • I have gone further and discovered that:

    - the ASM routine fatal_error is called by exceptions that set the registers R0 and R1

    Now I know the kind of error (runtime error 0x503).

    However, I would probably like to know the program counter where the error happened. Is this possible?

    One last question: I see from other forum posts that the CCES should give me a very detailed report like this:

    A non-recoverable error or exception has occurred.
    
    Description: Data Fault Exception - caused by attempting to access invalid data memory.
    
    General Type: RunTimeError
    
    Specific Type: ExceptAbrtData
    
    Error Message: If this is a synchronous fault, address 0x310010a0 held in Data Fault Address Register (DFAR) is the problem address.
    
    Error PC: 0xc1005124

    Why does this not happen to me? The debug session does not even stop and the Debugger console is empty ("No console to display at this time"). What is wrong? I had to figure out by myself that the ARM crashed, pause the processor and look at the disassembly to discover that the system was in fatal_error and was going to loop into a NOP forever.

    How can I have the detailed report printed in CCES?

  • Hi,

    The IDDE should automatically set a breakpoint at "__fatal_error" and print the report when this breakpoint is hit. Can you check that this breakpoint exists and is enabled? The automatic breakpoints are listed separately from user breakpoints: you can find them under the "Automatic Breakpoints" tab for your debug configuration. Regardless of this, you should find the address where the error occurred in R2 once the "fatal_error" function is called.

    Regarding the global variables, I've tried creating a small ARM executable and they seem to be defined (i.e. I can view them in the debugger's expressions window). Can you try accessing them *without* the leading underscore?

    Thanks,
    Kenny

  • Hi Kenny, apologies for reviving an old post, but I thought it better than starting a separate, similar thread.

    I have an SC573 running FreeRTOS on the Arm core and am experiencing a rare but pernicious fatal error. Like the OP, I get no report printed to the console. Could this be due to having semi-hosting disabled (as per the Analog Devices FreeRTOS user guide)?

    I have confirmed that the fatal error breakpoint is set for core0.

    Is there any other way to diagnose the cause of a fatal error, given that there is no call stack?

    Many thanks

    Connor

  • Hi Connor,

    When running an application in the FreeRTOS environment, you have to redirect the console output to a PuTTY window. Please refer to the section "Run the example" in adi_freertos_user_guide.

    When a fatal error occurs, the following variables will have values assigned to them by the fatal error handling mechanism. You should be able to view these in the Expressions window once the fatal error breakpoint is hit:

    adi_fatal_error_general_code

    adi_fatal_error_specific_code

    adi_fatal_error_value

    adi_fatal_error_pc

    For more information, please refer to CrossCore® Embedded Studio 2.11.0 > ARM® Development Tools Documentation > Cortex-A > Analog Devices ARM Toolchain Manual > Analog Devices Run-time Library Support in the Help menu.
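When no debugger console is available, one option is to format these four values into a text buffer and send it over whatever output channel the application already has. The following is only a minimal sketch of that idea: the struct `fatal_snapshot_t` and the function `format_fatal_snapshot` are hypothetical names, and treating each value as a `uint32_t` is an assumption that should be checked against the run-time library headers.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the four values listed above. The uint32_t types
 * are an assumption; check the run-time library headers for the real ones. */
typedef struct {
    uint32_t general_code;
    uint32_t specific_code;
    uint32_t value;
    uint32_t pc;
} fatal_snapshot_t;

/* Format the snapshot as a single line of text. Returns what snprintf
 * returns: the length the full line needs, excluding the terminator. */
int format_fatal_snapshot(const fatal_snapshot_t *s, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "general=0x%lx specific=0x%lx value=0x%lx pc=0x%lx",
                    (unsigned long)s->general_code,
                    (unsigned long)s->specific_code,
                    (unsigned long)s->value,
                    (unsigned long)s->pc);
}
```

On the target, the snapshot would be filled from adi_fatal_error_general_code and the other three variables once the fatal error has been reported, and the resulting buffer passed to the redirected console output.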

    Best Regards,
    Santhakumari.K

  • Hi Santhakumari,

    Thanks for your response.

    We can see the following text on the console:

    A non-recoverable error or exception has occurred.

      Description:   No dispatched handler available for the specified interrupt code.

      General Type:  RunTimeError

      Specific Type: NoDispatchedHandler

      Error Message: Enabled interrupt with ID (IID) 0x7fffffff raised when it has no handler registered.

      Error PC:      0xa0a7f4e4


    And the variables you pointed us towards confirm the above. However, the interrupt ID seems totally erroneous. We have previously encountered this and were advised by Maikel to edit the unhandled interrupt handler to the following:

    adi_rtl_unhandled_handler:

      PUSH {R4-R11, LR}

      MOV     R2,  R0                           /* ADDED BY MP! */

      LDR r0, =_AFE_G_RunTimeError                                                  /* Report a fatal error */

      LDR r1, =_AFE_S_NoDispatchedHandler

      LDR R4, =adi_fatal_error

      BLX R4

    However, we are still getting this erroneous value as the interrupt ID. Is there any other way to get the actual interrupt that is being raised?

    Many thanks, 

    Connor

     

  • Hi Connor,

    I'll ask the development teams and see if anyone recognises this. There is an error PC value in the console output (0xa0a7f4e4) - that's the address where the interrupt occurred. Does that give you any clues? You can enter that address directly in the disassembly window to get to the code.

    I did a quick check on the forum to see if anyone else has reported a similar issue, and there's one instance: RE: Is it OK to call adi_spi_Open() and adi_spi_Close() from inside of an interrupt?.
    It looks like it was a user error which probably resulted in invalid memory being accessed, so that might be a possibility in your case. Is there anything suspicious at the PC address mentioned above?

    Thanks,
    Kenny

  • Hi Kenny, 

    The PC value displayed unfortunately just points to the bottom of adi_rtl_unhandled_handler, where the fatal error call was made, making the whole situation a little circular. The real problem here is that the value shown as the unhandled interrupt ID is not a valid interrupt ID number.

    The linked forum post is certainly interesting, and we will scour our project for any likely invalid accesses. However, this isn't trivial in what is now quite a large project with many tasks running. It could well be that one task is trying to access something that another task is using.

    Many thanks, 

    Connor

  • Hi Connor,

    I've spoken to a colleague and there's an issue with the ARM fatal error handling mechanism which can result in the error message being incorrect. This will be fixed in an upcoming release of CCES (probably 2.11.1), but I've included an updated source file which contains the fix. This can be dropped into your project, and will be included in the executable when you rebuild. If you re-run the executable and reproduce the error message, it will hopefully give some more information.

    I tried to attach the file, but there doesn't seem to be any easy way to do that. :-/

    Note: This code has been through our internal review and testing processes, but not the full release process. This means it shouldn't be treated as production code.

    Thanks,
    Kenny

    /*
     * Copyright (c) 2014-2022 Analog Devices, Inc.  All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without
     * modification, are permitted (subject to the limitations in the
     * disclaimer below) provided that the following conditions are met:
     *
     * * Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     * * Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the
     *    distribution.
     *
     * * Neither the name of Analog Devices, Inc.  nor the names of its
     *    contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     * NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE
     * GRANTED BY THIS LICENSE.  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT
     * HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
     * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
     * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
     * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
     * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
     * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
     * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
     * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
     * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
     * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
     * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     */
    
    #include <runtime/int/interrupt.h>
    #include <stddef.h>
    #include <stdint.h>
    #include "adi_rtl_unhandled.h"
    
    /* Static function prototypes */
    static inline int32_t get_iid_index(uint32_t iid);
    
    /* Dispatched interrupt vector table */
    extern adi_dispatched_data_t adi_dispatched_int_vector_table[ADI_DISPATCHED_VECTOR_TABLE_SIZE];
    
    /* Returns the dispatched interrupt vector table index for a given IID. */
    static inline int32_t get_iid_index(uint32_t iid)
    {
        int32_t table_index = -1;
    
        /* Get the table index */
        if (ADI_RTL_IID_IS_EXCEPTION(iid)) {
            table_index = ADI_RTL_EXC_IID_TO_INDEX(iid);
        } else {
            table_index = ADI_RTL_IID_TO_INDEX(iid);
        }
        return table_index;
    }
    
    /* Register a handler for the given interrupt id, the same API can be used to register the
       handler for exceptions, IRQs.
    */
    int32_t adi_rtl_register_dispatched_handler(uint32_t _iid,
                                                adi_dispatched_handler_t _handler,
                                                adi_dispatched_callback_arg_t _callback_arg)
    {
        int32_t index = -1;
    
        /* Get the table index */
        index = get_iid_index(_iid);
    
        /* Verify that IID is within the valid range */
        if((index < 0) || (index >= ADI_DISPATCHED_VECTOR_TABLE_SIZE))
            return -1;
    
        adi_rtl_disable_interrupts();
    
        /* Save the given handler and callback arguments */
        adi_dispatched_int_vector_table[index].handler = _handler;
        adi_dispatched_int_vector_table[index].callback_arg = _callback_arg;
    
        adi_rtl_reenable_interrupts();
    
        return index;
    }
    
    /* Unregister the dispatched handler. */
    int32_t adi_rtl_unregister_dispatched_handler(uint32_t _iid)
    {
        int32_t index = -1;
        adi_dispatched_handler_t default_handler;
    
        /* Get the table index */
        index = get_iid_index(_iid);
    
        /* Verify that IID is within the valid range */
        if((index < 0) || (index >= ADI_DISPATCHED_VECTOR_TABLE_SIZE))
            return -1;
    
        /* Reset the handler and callback to defaults */
    #if defined(__ADSPCORTEXA5__)
        if (_iid == ADI_RTL_XID_UND) {
            /* Undefined instruction */
            default_handler = adi_rtl_unhandled_except_und;
        } else if (_iid == ADI_RTL_XID_SVC) {
            /* Supervisor Call */
            default_handler = adi_rtl_unhandled_except_svc;
        } else if (_iid == ADI_RTL_XID_ABORT_PREFETCH) {
            /* Prefetch abort */
            default_handler = adi_rtl_unhandled_except_abort_prefetch;
        } else if (_iid == ADI_RTL_XID_ABORT_DATA) {
            /* Data abort */
            default_handler = adi_rtl_unhandled_except_abort_data;
        } else {
            default_handler = adi_rtl_unhandled_handler;
        }
    #elif defined(__ADSPCORTEXA55__)
        if (_iid == ADI_RTL_XID_SYNC) {
            /* Synchronous exception */
            default_handler = adi_rtl_unhandled_except_sync;
        } else if (_iid == ADI_RTL_XID_SERR) {
            /* System error */
            default_handler = adi_rtl_unhandled_except_serror;
        } else {
            default_handler = adi_rtl_unhandled_handler;
        }
    #endif
    
        adi_rtl_disable_interrupts();
    
        adi_dispatched_int_vector_table[index].handler = default_handler;
        adi_dispatched_int_vector_table[index].callback_arg = NULL;
    
        adi_rtl_reenable_interrupts();
    
        return 0;
    }
    
    /* Initializes the dispatched vector table with default interrupt handlers. */
    void _init_dispatch_tables(void)
    {
        uint32_t i;
        for(i = 0u; i < (uint32_t)ADI_DISPATCHED_VECTOR_TABLE_SIZE; i++)
        {
            /* Save the given handler and callback arguments */
            adi_dispatched_int_vector_table[i].handler = adi_rtl_unhandled_handler;
            adi_dispatched_int_vector_table[i].callback_arg = NULL;
        }
    
        /* Use specific default handlers for non IRQ/FIQ exceptions. */
    #if defined(__ADSPCORTEXA5__)
        adi_dispatched_int_vector_table[get_iid_index(ADI_RTL_XID_UND)].handler = adi_rtl_unhandled_except_und;
        adi_dispatched_int_vector_table[get_iid_index(ADI_RTL_XID_SVC)].handler = adi_rtl_unhandled_except_svc;
        adi_dispatched_int_vector_table[get_iid_index(ADI_RTL_XID_ABORT_PREFETCH)].handler = adi_rtl_unhandled_except_abort_prefetch;
        adi_dispatched_int_vector_table[get_iid_index(ADI_RTL_XID_ABORT_DATA)].handler = adi_rtl_unhandled_except_abort_data;
    #elif defined(__ADSPCORTEXA55__)
        adi_dispatched_int_vector_table[get_iid_index(ADI_RTL_XID_SYNC)].handler = adi_rtl_unhandled_except_sync;
        adi_dispatched_int_vector_table[get_iid_index(ADI_RTL_XID_SERR)].handler = adi_rtl_unhandled_except_serror;
    #endif
    }
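To make the behaviour of the listing above easier to follow, here is a simplified host-side model of the same pattern: every slot starts with a default "unhandled" handler, registration range-checks the ID, and unregistration restores the default. This is a sketch only, not the ADI implementation. All names (`model_*`, `example_user_handler`, `MODEL_TABLE_SIZE`) are hypothetical, the IID is assumed to map directly to a table index rather than going through the ADI_RTL_* macros, and the interrupt disable/re-enable calls are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size; stands in for ADI_DISPATCHED_VECTOR_TABLE_SIZE. */
#define MODEL_TABLE_SIZE 8u

typedef void (*model_handler_t)(uint32_t iid, void *arg);

typedef struct {
    model_handler_t handler;
    void *callback_arg;
} model_entry_t;

static model_entry_t model_table[MODEL_TABLE_SIZE];
static uint32_t model_last_iid; /* IID last seen by the default handler */

/* Default handler: stands in for adi_rtl_unhandled_handler. */
static void model_unhandled(uint32_t iid, void *arg)
{
    (void)arg;
    model_last_iid = iid;
}

/* Example user handler: counts calls through the callback argument. */
void example_user_handler(uint32_t iid, void *arg)
{
    (void)iid;
    (*(int *)arg)++;
}

/* Fill every slot with the default handler. */
void model_init(void)
{
    for (uint32_t i = 0u; i < MODEL_TABLE_SIZE; i++) {
        model_table[i].handler = model_unhandled;
        model_table[i].callback_arg = NULL;
    }
}

/* Returns the slot index, or -1 if the IID is outside the table. */
int32_t model_register(uint32_t iid, model_handler_t h, void *arg)
{
    if (iid >= MODEL_TABLE_SIZE) {
        return -1;
    }
    model_table[iid].handler = h;
    model_table[iid].callback_arg = arg;
    return (int32_t)iid;
}

/* Restores the default handler; returns 0, or -1 if out of range. */
int32_t model_unregister(uint32_t iid)
{
    if (iid >= MODEL_TABLE_SIZE) {
        return -1;
    }
    model_table[iid].handler = model_unhandled;
    model_table[iid].callback_arg = NULL;
    return 0;
}

/* Calls whatever handler occupies the slot; -1 if out of range. */
int32_t model_dispatch(uint32_t iid)
{
    if (iid >= MODEL_TABLE_SIZE) {
        return -1;
    }
    model_table[iid].handler(iid, model_table[iid].callback_arg);
    return 0;
}

uint32_t model_last_unhandled(void)
{
    return model_last_iid;
}
```

In this model, an ID such as 0x7fffffff is rejected by the range check, so it can never be registered or dispatched through the table.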

  • Hi Connor,

    I was just wondering if you've had a chance to try this code. If so, did it help?

    Thanks,
    Kenny