[#4917] KGDB single-steps into the middle of a 4-byte instruction on BF561 after a soft breakpoint is hit.

Document created by Aaronwu Employee on Aug 30, 2013

Submitted By: Sonic Zhang
Open Date: 2009-02-19 00:00:29
Close Date: 2010-01-14 02:16:24
Priority: Medium
Assignee: Mike Frysinger, Sonic Zhang
Status: Closed
Fixed In Release: N/A
Found In Release: 2009R1-RC6
Release:
Category: N/A
Board: N/A
Processor: BF561
Silicon Revision:
Is this bug repeatable?: Yes
Resolution: Fixed
Uboot version or rev.:
Toolchain version or rev.: SVN trunk 4.1
App binary format: N/A

Summary: KGDB single-steps into the middle of a 4-byte instruction on BF561 after a soft breakpoint is hit.

Details:

 

KGDB single-steps into the middle of a 4-byte instruction on BF561 after a software breakpoint set on that 4-byte instruction is hit. There is no problem with hardware breakpoints.

 

The configuration is default with KGDB and KGDB_TESTS enabled.

 

CONFIG_KGDB=y

CONFIG_KGDB_TESTS=y

CONFIG_SCHED_HRTICK=y

CONFIG_TICK_ONESHOT=y

# CONFIG_NO_HZ is not set

CONFIG_HIGH_RES_TIMERS=y

 

 

Wrong output after the software breakpoint is hit:

 

(gdb) c

Continuing.

cat /proc/kgdbtest

L1(before change) : data variable addr = 0xff800000, data value is 0

L1 : code function addr = 0xffa003dc

L1(after change) : data variable addr = 0xff800000, data value is 10

L2(before change) : data variable addr = 0xfeb00084, data value is 0

L2 : code function addr = 0xfeb00000

L2(after change) : data variable addr = 0xfeb00084, data value is 20

L1(before change) : data variable addr = 0xff800000, data value is 10

L1 : code function addr = 0xffa003dc

L1(after change) : data variable addr = 0xff800000, data value is 20

L2(before change) : data variable addr = 0xfeb00084, data value is 20

L2 : code function addr = 0xfeb00000

L2(after change) : data variable addr = 0xfeb00084, data value is 40

 

Correct output without a software breakpoint:

 

(gdb) c

Continuing.

cat /proc/kgdbtest

L1(before change) : data variable addr = 0xff800000, data value is 0

L1 : code function addr = 0xffa003dc

L1(after change) : data variable addr = 0xff800000, data value is 10

L2(before change) : data variable addr = 0xfeb00084, data value is 0

L2 : code function addr = 0xfeb00000

L2(after change) : data variable addr = 0xfeb00084, data value is 20

 

Follow-ups

 

--- Sonic Zhang                                              2009-02-23 04:22:58

If ICACHE is turned off, this bug disappears.

 

--- Sonic Zhang                                              2009-02-23 05:38:04

This bug is confirmed to be an issue in the icache invalidate code. If the following sequence occurs, the wrong instruction is loaded.

1) The instruction at a given address is cached in the icache.

2) This instruction is changed in SDRAM.

3) IFLUSH[P0] is executed only once in blackfin_icache_flush_range().

4) The instruction is then executed.
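As a rough illustration of why steps 1) to 4) depend on the invalidate taking effect, here is a toy single-line cache model in C. It is only a sketch: `toy_icache`, `toy_fetch16`, `toy_invalidate`, `toy_store16` and the 32-byte line size are assumptions made up for this example, and the model does not reproduce the hardware behaviour reported in this bug (where a single IFLUSH appeared not to be enough).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 32u   /* assumed line size for this toy model */
#define MEM_BYTES  256u  /* size of the simulated SDRAM */

/* One cache line: a valid bit, the line-aligned address, and a data copy. */
struct toy_icache {
    bool     valid;
    uint32_t tag;
    uint8_t  data[LINE_BYTES];
};

static uint8_t toy_mem[MEM_BYTES];  /* simulated SDRAM */

/* Write a 16-bit value to simulated SDRAM (little-endian), bypassing the cache. */
static void toy_store16(uint32_t addr, uint16_t v)
{
    assert(addr + 1u < MEM_BYTES);
    toy_mem[addr]      = (uint8_t)(v & 0xffu);
    toy_mem[addr + 1u] = (uint8_t)(v >> 8);
}

/* Fetch 16 bits through the cache; the line is filled from memory on a miss. */
static uint16_t toy_fetch16(struct toy_icache *c, uint32_t addr)
{
    uint32_t base = addr & ~(LINE_BYTES - 1u);
    uint32_t off  = addr - base;

    assert(addr + 1u < MEM_BYTES && off + 1u < LINE_BYTES);
    if (!c->valid || c->tag != base) {
        memcpy(c->data, &toy_mem[base], LINE_BYTES);
        c->tag   = base;
        c->valid = true;
    }
    return (uint16_t)(c->data[off] | (c->data[off + 1u] << 8));
}

/* Drop the line that contains addr, if it is currently cached. */
static void toy_invalidate(struct toy_icache *c, uint32_t addr)
{
    uint32_t base = addr & ~(LINE_BYTES - 1u);

    if (c->valid && c->tag == base)
        c->valid = false;
}
```

In this model, a fetch after `toy_store16()` keeps returning the old value until `toy_invalidate()` is called for the same line, which mirrors the stale-instruction symptom described in the sequence.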

 

This issue was worked around properly until Mike cleaned up the cache flush code in commit r5419.

------------------------------------------------------------------------

r5419 | vapier | 2008-10-15 03:05:48 +0800 (Wed, 15 Oct 2008) | 1 line

 

unify/cleanup cache code to (1) be correct wrt to end ranges and (2) be optimal

with a one-instruction hardware loop

------------------------------------------------------------------------

We should add the workaround back.

 

--- Sonic Zhang                                              2009-02-23 05:52:26

Fixed.

 

--- Mike Frysinger                                           2009-02-23 11:23:39

this doesn't make sense.  either the icache gets flushed, or it doesn't. executing IFLUSH twice on the same address should make no difference at all.

 

you've changed the code so it does:

IFLUSH[start];

start = start & -L1_CACHE_BYTES;

IFLUSH[start];

 

so even if start is in the middle of a cache line (which it rarely is), the

IFLUSH should still hit the same cache line.
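The point about both IFLUSH operands landing in the same cache line can be checked with a small C sketch of the align-down arithmetic. `line_base()` and `same_cache_line()` are hypothetical helper names, and `L1_CACHE_BYTES` is assumed to be 32 here; the value used by the kernel for a given part should be checked in its headers.

```c
#include <assert.h>
#include <stdint.h>

#define L1_CACHE_BYTES 32u  /* assumed line size for this sketch */

/*
 * Same arithmetic as `start & -L1_CACHE_BYTES` in the snippet above:
 * for a power-of-two N, the unsigned value -N equals ~(N - 1), so the
 * low bits are cleared and the address is rounded down to its line base.
 */
static uint32_t line_base(uint32_t addr)
{
    return addr & (0u - L1_CACHE_BYTES);
}

/* Non-zero if both addresses fall inside the same cache line. */
static int same_cache_line(uint32_t a, uint32_t b)
{
    return line_base(a) == line_base(b);
}
```

For example, with a 32-byte line, 0x8866 rounds down to 0x8860, so `same_cache_line(0x8866, line_base(0x8866))` holds and the second IFLUSH addresses the line the first one already touched.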

 

--- Bernd Schmidt                                            2009-02-23 11:38:46

Were there any other workarounds in that file that were lost in r5419?

 

--- Mike Frysinger                                           2009-02-23 11:49:22

this isn't a documented issue.  this is "it seems to work" but there's no real information.

 

other csync/ssync's were dropped, but those all applied to old Blackfins we do

not support.

 

--- Robin Getz                                               2009-02-23 12:21:12

Yeah, I'm not sure I follow how this is a fix.

 

-Robin

 

--- Sonic Zhang                                              2009-02-23 23:16:40

The double IFLUSH is the only workaround found to have been lost in r5419.

This is a workaround for this bug. I haven't found a documented anomaly yet.

Yes, the two IFLUSHes hit the same cache line. The bug occurs if this cache line is flushed only once. Any idea about the real root cause?

 

--- Mike Frysinger                                           2009-02-23 23:27:51

can you describe exactly what's going on so it can be reproduced w/out linux, let alone kgdb ?

 

--- Sonic Zhang                                              2009-02-23 23:49:36

You can reproduce it with the following steps:

0) Configure DCACHE in WT mode and enable ICACHE.

1) In the following kgdb_test() function, replace the first 2 bytes of the 4-byte instruction at address 0x8866 with the 2-byte instruction 0xa1.

2) Call kgdb_test() for the first time.

3) The exception is hit.

4) In trap.c, replace the 2-byte instruction 0xa1 with the backed-up 2 bytes.

5) Invalidate the instruction range from addr to addr + 2, executing IFLUSH only once.

5) Subtract 2 from the PC on the saved register stack and return from the exception.

6) The first instruction loaded after the return is the 2-byte instruction 0xa1, not the original 4-byte instruction.

 

 

Dump of assembler code for function kgdb_test:

0x0000885c <kgdb_test+0>:       LINK 0x14;              /* (20) */

0x00008860 <kgdb_test+4>:       [FP + 0x8] = R0;

0x00008862 <kgdb_test+6>:       [FP + 0xc] = R1;

0x00008864 <kgdb_test+8>:       [FP + 0x10] = R2;

0x00008866 <kgdb_test+10>:      R1.H = 0x10;            /* ( 16) R1=0x0x100000 <unix_dgram_poll+56>(1048576) */

0x0000886a <kgdb_test+14>:      R1.L = 0x688c;          /* (26764) R1=0x0x10688c(1075340) */

0x0000886e <kgdb_test+18>:      R2 = [FP + 0xc];

0x00008870 <kgdb_test+20>:      R0 = [FP + 0x10];

0x00008872 <kgdb_test+22>:      [SP + 0xc] = R0;

0x00008874 <kgdb_test+24>:      R0 = [FP + 0x14];

0x00008876 <kgdb_test+26>:      [SP + 0x10] = R0;

0x00008878 <kgdb_test+28>:      R0 = R1;

0x0000887a <kgdb_test+30>:      R1 = R2;

0x0000887c <kgdb_test+32>:      R2 = [FP + 0x8];

0x0000887e <kgdb_test+34>:      P2.H = 0x1;             /* (  1) P2=0x0x10000 <do_fork+476> */

0x00008882 <kgdb_test+38>:      P2.L = 0xcd8;           /* (3288) P2=0x0x10cd8 <printk> */

0x00008886 <kgdb_test+42>:      CALL (P2);

0x00008888 <kgdb_test+44>:      R0 = [FP + 0x14];

0x0000888a <kgdb_test+46>:      [FP + 0x10] = R0;

0x0000888c <kgdb_test+48>:      R0 = [FP + 0x10];

0x0000888e <kgdb_test+50>:      UNLINK;

0x00008892 <kgdb_test+54>:      RTS;
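The patch-and-restore part of these steps can be sketched in C on a plain byte buffer. This is a minimal model, not the KGDB or trap.c code: `soft_bp`, `bp_insert()`, `bp_remove()` and `pc_after_bp()` are hypothetical names, and the breakpoint opcode is taken to be the 2-byte value written as 0xa1 in the steps above, stored little-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BP_OPCODE 0x00a1u  /* 2-byte breakpoint opcode ("0xa1" in the steps above) */
#define BP_LEN    2u       /* only the first 2 bytes of the instruction are replaced */

/* Bookkeeping for one software breakpoint. */
struct soft_bp {
    uint32_t addr;           /* offset of the patched instruction in mem */
    uint8_t  saved[BP_LEN];  /* backup of the 2 overwritten bytes */
};

/* Step 1: back up the first 2 bytes at addr and overwrite them with the opcode. */
static void bp_insert(uint8_t *mem, struct soft_bp *bp, uint32_t addr)
{
    bp->addr = addr;
    memcpy(bp->saved, &mem[addr], BP_LEN);
    mem[addr]      = (uint8_t)(BP_OPCODE & 0xffu);
    mem[addr + 1u] = (uint8_t)(BP_OPCODE >> 8);
}

/* Step 4: put the backed-up 2 bytes back in place. */
static void bp_remove(uint8_t *mem, const struct soft_bp *bp)
{
    memcpy(&mem[bp->addr], bp->saved, BP_LEN);
}

/* Second step 5: rewind the saved PC by 2 so the restored instruction is re-run. */
static uint32_t pc_after_bp(uint32_t trapped_pc)
{
    assert(trapped_pc >= BP_LEN);
    return trapped_pc - BP_LEN;
}
```

Note that while the breakpoint is in place, the last 2 bytes of the original 4-byte instruction remain in memory untouched. If the core still fetches a stale copy after `bp_remove()`, it sees a 2-byte opcode followed by that tail rather than the original 4-byte instruction.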

 

--- Mike Frysinger                                           2009-02-23 23:53:00

thanks, i think that's enough for me to try and reproduce with like u-boot/jtag

 

--- Sonic Zhang                                              2009-02-23 23:59:58

Correction to one line:

6) The first instruction loaded after the return is a 2-byte instruction. It may not be 0xa1, but it is also not the original 4-byte instruction.

 

--- Robin Getz                                               2009-02-24 07:07:28

I know there were some undocumented issues about 32-bit or 64-bit instructions getting replaced by a 16-bit breakpoint, with the remaining "tail" opcodes being decoded in the pipeline and causing issues like 05000281.

 

But I don't think a flush would fix that. Would be interesting to understand if

a CSYNC also fixed things.

 

--- Sonic Zhang                                              2009-02-25 00:41:21

Neither CSYNC nor SSYNC helps.

 

--- Mike Frysinger                                           2009-02-27 02:12:19

does this only occur on the bf561-ezkit ?  or have you checked other boards ?

 

--- Sonic Zhang                                              2009-02-27 03:06:04

No. This was initially discovered on BF533-STAMP and BF537-STAMP.

 

--- Mike Frysinger                                           2009-02-27 16:55:00

so how exactly is this test working on the board ?  the original summary has a

dearth of information ...

 

to try to reproduce, i took a BF537-EZKIT with 0.3 silicon and:

$ make AnalogDevices/BF537-STAMP_default

$ make linux_menuconfig

<enable CONFIG_KGDB and CONFIG_KGDB_TESTS>

$ make

 

then i booted the kernel and ran:

root:/> cat /proc/kgdbtest

L1(before change) : data variable addr = 0xff800000, data value is 0

L1 : code function addr = 0xffa00390

L1(after change) : data variable addr = 0xff800000, data value is 10

 

then i ran the same `cat` command over and over and every time i would get the

same three lines of output

 

presumably something else needs to be happening here ... what is that ?

 

--- Mike Frysinger                                           2009-02-28 14:35:42

ok, i read through the kgdb expect script and found out how to reproduce this

 

my tests though indicate this is a problem with kgdb_disable_hw_debug().  if i

remove the writes of 0 to WPIACTL and WPDCTL, things seem to work.

 

this might also explain why when i connect with gdb over jtag things work one

time and then crash: gdb over jtag set the PWR bit in those registers and then

kgdb cleared them.  so writing 0 to the PWR bit when it's already 0 looks like

it causes misbehavior.
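To make the hypothesis concrete, here is a small C sketch of a disable routine that skips the register write when the power bit is already clear. It runs against a simulated register, not real hardware: `toy_reg`, `toy_reg_write()`, `toy_disable_hw_debug()` and the `TOY_PWR_BIT` position are all assumptions for illustration, and this is not the change that was made to kgdb_disable_hw_debug().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PWR_BIT 0x1u  /* hypothetical position of the PWR bit */

/* Simulated control register that counts how often it is written. */
struct toy_reg {
    uint32_t value;
    unsigned writes;
};

/* Every store to the simulated register goes through here. */
static void toy_reg_write(struct toy_reg *r, uint32_t v)
{
    r->value = v;
    r->writes++;
}

/* Clear the register only if the power bit is set; otherwise leave it alone. */
static void toy_disable_hw_debug(struct toy_reg *ctl)
{
    assert(ctl);
    if (ctl->value & TOY_PWR_BIT)
        toy_reg_write(ctl, 0);
}
```

With this guard, calling the disable routine on an already powered-down unit performs no write at all, which is the case suspected above of causing misbehavior.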

 

i'll check with design

 

 

 
