[#4273] Double faults are impossible to debug

Document created by Aaronwu Employee on Aug 29, 2013

Submitted By: Robin Getz
Open Date: 2008-07-23 17:45:04
Close Date: 2008-08-29 04:06:58
Priority: Medium
Assignee: Mike Frysinger, Robin Getz
Status: Closed
Fixed In Release: N/A
Found In Release: N/A
Release:
Category: N/A
Board: N/A
Processor: N/A
Silicon Revision:
Is this bug repeatable?: Yes
Resolution: Fixed
Uboot version or rev.:
Toolchain version or rev.:
App binary format: N/A

Summary: Double faults are impossible to debug

Details:

 

With today's kernel and U-Boot (both trunk and branch), a double fault hangs the system until the watchdog (if enabled) resets the device. After that reset, SWRST reports only that we were reset by the watchdog, and gives no clue as to where things actually went wrong.

 

The hardware is capable of doing better.

 

Software Reset Register (SWRST)

 

Core Double Fault Reset Enable

0 - No reset caused by Core Double Fault

1 - Reset generated upon Core Double Fault
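As a rough illustration of the enable bit described above, the sketch below models SWRST as a plain variable and sets only the Core Double Fault Reset Enable bit. The mask value 0x0008 (bit 3) and the helper names swrst_write(), swrst_read() and enable_double_fault_reset() are assumptions made for this example, not taken from the hardware reference; check the processor headers before relying on them.

```c
#include <stdint.h>

/* ASSUMPTION: Core Double Fault Reset Enable is bit 3 of SWRST. */
#define SWRST_DOUBLE_FAULT_ENABLE 0x0008u

/* Hypothetical stand-in for the memory-mapped SWRST register. */
static uint16_t fake_swrst;

/* Hypothetical accessors; real code would use the MMR access macros. */
static void swrst_write(uint16_t value) { fake_swrst = value; }
static uint16_t swrst_read(void) { return fake_swrst; }

/* Ask the core to reset on a double fault instead of hanging. */
static void enable_double_fault_reset(void)
{
    swrst_write(SWRST_DOUBLE_FAULT_ENABLE);
}
```

On real hardware the write would go to the SWRST register itself; the variable here only makes the bit manipulation visible on a host machine.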

 

If an exception occurs in an event handler that is already servicing an Exception, NMI, Reset, or Emulation event, this will trigger a double fault condition, and the address of the excepting instruction will be written to RETX.

 

In U-Boot, we do store it and pass it up to u-boot/u-boot-1.1.6/lib_blackfin/board.c:board_init_f(ulong bootflag) as the bootflag parameter, but then we don't do anything with it.

 

So, all we need to do is:

- set the Core Double Fault Reset Enable bit in the SWRST

- have U-Boot pass RETX to the kernel somehow.

- have the kernel print out the value if it is recovering from a double fault
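The last step in the list above could look roughly like the following sketch, which turns a saved SWRST value and a RETX value into a message. The status masks (0x2000, 0x4000, 0x8000), the function name describe_reset() and the message strings are assumptions for illustration only; the real masks should come from the processor headers.

```c
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: SWRST status bit masks; verify against the headers. */
#define RESET_DOUBLE   0x2000u
#define RESET_WDOG     0x4000u
#define RESET_SOFTWARE 0x8000u

/*
 * Hypothetical helper: format the reset cause into buf.
 * swrst is the SWRST value read at boot, retx the value handed
 * over by the bootloader. Returns what snprintf() returns.
 */
static int describe_reset(uint16_t swrst, uint32_t retx,
                          char *buf, size_t len)
{
    if (swrst & RESET_DOUBLE)
        return snprintf(buf, len,
                        "Recovering from Double Fault event at 0x%08lx",
                        (unsigned long)retx);
    if (swrst & RESET_WDOG)
        return snprintf(buf, len, "Recovering from Watchdog event");
    if (swrst & RESET_SOFTWARE)
        return snprintf(buf, len, "Reset caused by Software reset");
    return snprintf(buf, len, "Reset cause unknown");
}
```

The double fault check comes first so that the faulting address is reported whenever that bit is set, whatever else the register shows.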

 

This will change the ABI between the kernel and the bootloader. It could be done either by appending a simple ASCII "RETX=0x00000000" to the bootargs (where the kernel can parse it like any other argument), or by predefining a "special" location in scratchpad SRAM, like we do with the bootargs start address.

 

I think I prefer the special location: it does not add noise to anything, and it doesn't break when you have a built-in command line.
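For comparison, the bootargs option could be parsed along these lines. This is only a sketch of the idea using standard C library calls; the function name parse_retx_bootarg() is hypothetical, and a real kernel would go through its own command-line parsing rather than strstr()/strtoul().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: look for "RETX=<hex>" in a bootargs string.
 * Returns 1 and stores the value in *retx if found, 0 otherwise.
 */
static int parse_retx_bootarg(const char *bootargs, uint32_t *retx)
{
    static const char key[] = "RETX=";
    const size_t keylen = sizeof(key) - 1;
    const char *p = bootargs;

    while ((p = strstr(p, key)) != NULL) {
        /* Only accept the key at the start of an argument. */
        if (p == bootargs || p[-1] == ' ') {
            char *end;
            /* Base 16 also accepts an optional "0x" prefix. */
            unsigned long v = strtoul(p + keylen, &end, 16);

            if (end != p + keylen) {
                *retx = (uint32_t)v;
                return 1;
            }
        }
        p += keylen;
    }
    return 0;
}
```

The sketch also shows the weakness noted above: it only finds the value if the bootloader's string is the one the kernel actually parses, which is not the case with a built-in command line.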

 

 

 

Follow-ups

 

--- Mike Frysinger                                           2008-07-23 22:07:13

ive already documented the ABI for passing RETX.  U-Boot today will save RETX and pass it up to the higher C levels, but it doesnt currently pass it to the kernel.  i think passing RETX to the kernel via RETX is OK.

 

--- Robin Getz                                               2008-07-24 10:56:04

OK - that works for me.

 

That means in U-Boot's board_init_f we need to save RETX (bootflag) somewhere, and then in lib_blackfin/boot.c:do_bootm_linux(), after we disable the caches, we load it back into RETX.

And we need to modify all the kernel's arch-bf*/head.S files to save RETX, so it doesn't get clobbered between when we turn the caches on and when we print things out. (It looks like it would be OK today to just print it out without saving it first, since we turn the caches on so late in the boot process in kernel/setup.c.)

 

?

 

--- Robin Getz                                               2008-07-25 16:52:13

This should work for the kernel side (I think). Right now, it just prints out the last exception from U-Boot.

 

Index: arch/blackfin/kernel/setup.c
===================================================================
--- arch/blackfin/kernel/setup.c        (revision 5039)
+++ arch/blackfin/kernel/setup.c        (working copy)
@@ -773,9 +773,15 @@
        /* If we double fault, reset the system - otherwise we hang forever */
        bfin_write_SWRST(DOUBLE_FAULT);
 
-       if (_bfin_swrst & RESET_DOUBLE)
-               printk(KERN_INFO "Recovering from Double Fault event\n");
-       else if (_bfin_swrst & RESET_WDOG)
+       if (_bfin_swrst & RESET_DOUBLE) {
+               /*
+                * make sure we print this out before init_exception_vectors() where
+                * retx will get clobbered
+                */
+               unsigned int retx;
+               asm volatile ("%0 = RETX": "=r"(retx) : );
+               printk(KERN_INFO "Recovering from Double Fault event at 0x%08x\n", retx);
+       } else if (_bfin_swrst & RESET_WDOG)
                printk(KERN_INFO "Recovering from Watchdog event\n");
        else if (_bfin_swrst & RESET_SOFTWARE)
                printk(KERN_NOTICE "Reset caused by Software reset\n");
Index: arch/blackfin/kernel/early_printk.c
===================================================================
--- arch/blackfin/kernel/early_printk.c (revision 5036)
+++ arch/blackfin/kernel/early_printk.c (working copy)
@@ -192,7 +192,7 @@
         * Note - don't change RETS - we are in a subroutine, or
         * RETE - since it might screw up if emulator is attached
         */
-       asm("\tRETI = %0; RETX = %0; RETN = %0;\n"
+       asm("\tRETI = %0; RETN = %0;\n"
                : : "p"(early_trap));
 
 }

 

--- Mike Frysinger                                           2008-07-28 05:13:26

u-boot trunk now passes RETX down when booting the kernel

 

the kernel should prob read retx in head.S to avoid nasty bitrot troubles like early printk or something else down the line stomping on RETX ...

 

--- Mike Frysinger                                           2008-08-29 05:06:57

things are nice and debuggable now in trunk

 

--- Robin Getz                                               2008-08-29 08:21:09

Just a note - you need trunk U-Boot (anything after revision 1335) for this to work properly - but it does work - and it has helped solve at least one tricky bug so far...

 

-Robin

 

 

 
