FAQ: missalign data access when doing tcp_transmit_skb (2010-02-09)

Document created by Aaronwu (Employee) on Aug 28, 2013. Last modified by sonic on Sep 8, 2013.
Version 2

2010-02-09 00:58:34     missalign data access when doing tcp_transmit_skb

Xin Xin (CHINA)

Message: 85784   

 

Hi,

 

     I hit a misaligned data access while doing a BT download, as shown below.

 

Error log:

 

[   17.804000] Data access misaligned address violation

[   17.804000]  - Attempted misaligned data memory or data cache access.

[   17.804000] Kernel OOPS in progress

[   17.804000] Deferred Exception context

[   17.804000] CURRENT PROCESS:

[   17.804000] COMM=usleep PID=14789

[   17.804000] CPU = 0

[   17.804000] TEXT = 0x03900000-0x03938c88        DATA = 0x03090c88-0x0309385c

[   17.804000]  BSS = 0x0309385c-0x00800000  USER-STACK = 0x0081fec0

[   17.804000]

[   17.804000] return address: [0x001b6846]; contents of:

[   17.804000] 0x001b6820:  17a8  9110  0040  e51d  001f  e560  00ac  9728

[   17.804000] 0x001b6830:  e560  00a4  b468  a138  c604  4000  c681  83c0

[   17.804000] 0x001b6840:  c681  8040  5608 [b068] e420  00a4  c604  4000

[   17.804000] 0x001b6850:  c681  83c0  c681  8040  5608  b0a8  c682  81f6

[   17.804000]

[   17.804000] ADSP-BF527-0.1 530(MHz CCLK) 132(MHz SCLK) (mpu off)

[   17.804000] Linux version 2.6.28.10-ADI-2009R1-svn100

[   17.804000] Built with gcc version 4.1.2 (ADI svn)

[   17.804000]

[   17.804000] SEQUENCER STATUS:                Not tainted

[   17.804000]  SEQSTAT: 00000024  IPEND: c030  SYSCFG: 0006

[   17.804000]   EXCAUSE   : 0x24

[   17.804000]   interrupts disabled

[   17.804000]   physical IVG5 asserted : <0xffa00b7c> { _evt_ivhw + 0x0 }

[   17.804000]   physical IVG14 asserted : <0xffa00988> { _evt14_softirq + 0x0 }

[   17.804000]   physical IVG15 asserted : <0xffa00cd4> { _evt_system_call + 0x0 }

[   17.804000]   logical irq   6 mapped  : <0xffa00364> { _timer_interrupt + 0x0 }

[   17.804000]   logical irq  21 mapped  : <0x00183f9c> { _bfin_rtc_interrupt + 0x0 }

[   17.804000]   logical irq  22 mapped  : <0x0016a094> { _bf5xx_nand_dma_irq + 0x0 }

[   17.804000]   logical irq  27 mapped  : <0x0018628c> { _bfin_twi_interrupt_entry + 0x0 }

[   17.804000]   logical irq  29 mapped  : <0x00134cac> { _bfin_serial_dma_rx_int + 0x0 }

[   17.804000]   logical irq  30 mapped  : <0x00134ef8> { _bfin_serial_dma_tx_int + 0x0 }

[   17.804000]   logical irq  35 mapped  : <0x001441b4> { _bfin_mac_interrupt + 0x0 }

[   17.804000]   logical irq  59 mapped  : <0x0017de8c> { _blackfin_interrupt + 0x0 }

[   17.804000]  RETE: <0x00000000> /* Maybe null pointer? */

[   17.804000]  RETN: <0x03a37a8c> /* kernel dynamic memory */

[   17.804000]  RETX: <0x00000480> /* Maybe fixed code section */

[   17.804000]  RETS: <0x001b67d4> { _tcp_transmit_skb + 0xa4 }

[   17.804000]  PC  : <0x001b6846> { _tcp_transmit_skb + 0x116 }

[   17.804000] DCPLB_FAULT_ADDR: <0x039cb130> /* kernel dynamic memory */

[   17.804000] ICPLB_FAULT_ADDR: <0x001b6846> { _tcp_transmit_skb + 0x116 }

[   17.804000]

[   17.804000] PROCESSOR STATE:

[   17.804000]  R0 : da0f22b2    R1 : 000f00b2    R2 : 0000ffff    R3 : 00000000

[   17.804000]  R4 : 00000000    R5 : 03a37aa0    R6 : 00000020    R7 : 00000001

[   17.804000]  P0 : 03a37aa0    P1 : 00480c20    P2 : 002f17a8    P3 : 00480c00

[   17.804000]  P4 : 033b5160    P5 : 039cb12e    FP : 00480c20    SP : 03a379b0

[   17.804000]  LB0: 00122c42    LT0: 00122c42    LC0: 00000000

[   17.804000]  LB1: 00008896    LT1: 00008896    LC1: 00000000

[   17.804000]  B0 : 00000000    L0 : 00000000    M0 : 00000000    I0 : 03a37ddc

[   17.804000]  B1 : 00000000    L1 : 00000000    M1 : 00000000    I1 : 00480bb0

[   17.804000]  B2 : 00000000    L2 : 00000000    M2 : 00000000    I2 : 00000000

[   17.804000]  B3 : 00000000    L3 : 00000000    M3 : 00000000    I3 : 00000000

[   17.804000] A0.w: 00000000   A0.x: 00000000   A1.w: 00000000   A1.x: 00000000

[   17.804000] USP : 0081fac0  ASTAT: 02002002

[   17.804000]

[   17.804000] Hardware Trace:

[   17.804000]    0 Target : <0x00004c24> { _trap_c + 0x0 }

[   17.804000]      Source : <0xffa00622> { _exception_to_level5 + 0xae }

[   17.804000]    1 Target : <0xffa00574> { _exception_to_level5 + 0x0 }

[   17.804000]      Source : <0xffa00432> { _bfin_return_from_exception + 0x6 }

[   17.804000]    2 Target : <0xffa0042c> { _bfin_return_from_exception + 0x0 }

[   17.804000]      Source : <0xffa004cc> { _ex_trap_c + 0x6c }

[   17.804000]    3 Target : <0xffa00460> { _ex_trap_c + 0x0 }

[   17.804000]      Source : <0xffa006ae> { _trap + 0x2a }

[   17.804000]    4 Target : <0xffa00684> { _trap + 0x0 }

[   17.804000]      Source : <0x001b6844> { _tcp_transmit_skb + 0x114 } 0x5608

[   17.804000]    5 Target : <0x001b67d4> { _tcp_transmit_skb + 0xa4 }

[   17.804000]      Source : <0x0018c9f0> { _skb_push + 0x2c } RTS

[   17.804000]    6 Target : <0x0018c9c4> { _skb_push + 0x0 }

[   17.804000]      Source : <0x001b67d0> { _tcp_transmit_skb + 0xa0 } CALL pcrel

[   17.804000]    7 Target : <0x001b67cc> { _tcp_transmit_skb + 0x9c }

[   17.804000]      Source : <0x001b6a44> { _tcp_transmit_skb + 0x314 } IF !CC JUMP

[   17.804000]    8 Target : <0x001b6a3a> { _tcp_transmit_skb + 0x30a }

[   17.804000]      Source : <0x001b67ca> { _tcp_transmit_skb + 0x9a } IF CC JUMP

[   17.804000]    9 Target : <0x001b67ac> { _tcp_transmit_skb + 0x7c }

[   17.804000]      Source : <0x001b63e4> { _tcp_established_options + 0x44 } RTS

[   17.804000]   10 Target : <0x001b63a0> { _tcp_established_options + 0x0 }

[   17.804000]      Source : <0x001b67a8> { _tcp_transmit_skb + 0x78 } CALL pcrel

[   17.804000]   11 Target : <0x001b679c> { _tcp_transmit_skb + 0x6c }

[   17.804000]      Source : <0x001b6798> { _tcp_transmit_skb + 0x68 } IF CC JUMP

[   17.804000]   12 Target : <0x001b6780> { _tcp_transmit_skb + 0x50 }

[   17.804000]      Source : <0x001b677c> { _tcp_transmit_skb + 0x4c } IF CC JUMP

[   17.804000]   13 Target : <0x001b6778> { _tcp_transmit_skb + 0x48 }

[   17.804000]      Source : <0x0018c272> { ___skb_clone + 0xd2 } RTS

[   17.804000]   14 Target : <0x0018c1b8> { ___skb_clone + 0x18 }

[   17.804000]      Source : <0x0018c13a> { ___copy_skb_header + 0x10e } RTS

[   17.804000]   15 Target : <0x0018c0bc> { ___copy_skb_header + 0x90 }

[   17.804000]      Source : <0x00122c56> { _memcpy + 0x4e } RTS

[   17.804000]

[   17.804000] Kernel Stack

[   17.804000] Stack info:

[   17.804000]  SP: [0x03a37f24] <0x03a37f24> /* kernel dynamic memory */

[   17.804000]  Memory from 0x03a37f20 to 03a38000

[   17.804000] 03a37f20: 03244188 [00311452] 00008000  00000000  00000000  03a38000  00311452  00311452

[   17.804000] 03a37f40:<003154de> ffa00d38  02002022  00312ea9  038a2367  00312ea8  038a2366  00000000

[   17.804000] 03a37f60: 00000000  00000000  00000000  00000000  00000000  00000000  00000000  00000000

[   17.804000] 03a37f80: 00000000  00000000  00000000  00000000  00000000  00000000  00000000  00000000

[   17.804000] 03a37fa0: 00000000  00000000  00000000  03244340  000005c0  0081fac0  0081fac0  0082c4dc

[   17.804000] 03a37fc0: 039f7154  03244188  03244424  03244420  0000005b  0388a6d7  00000000  03244188

[   17.804000] 03a37fe0: 00000001  0388a6d7  0326d000  000006fe  0326d000  0326d000  0000005b  00000006

[   17.804000] Return addresses in stack:

[   17.804000]     address : <0x003154de> { _bfin_debug_mmrs_init + 0x30a }

[   17.804000] Modules linked in:

[   17.804000] Kernel panic - not syncing: Kernel exception
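
The EXCAUSE value printed in the log is the low six bits of SEQSTAT (here 00000024 gives 0x24, which the log itself labels as the misaligned data access violation). A minimal sketch of that extraction, where the macro and function names are illustrative and not taken from the kernel sources:

```c
#include <stdint.h>

/* EXCAUSE occupies the low six bits of the Blackfin SEQSTAT register. */
#define SEQSTAT_EXCAUSE_MASK 0x3Fu

/* Value the log above reports for "Data access misaligned address violation".
 * The name is made up for this sketch. */
#define EXCAUSE_DATA_MISALIGNED 0x24u

/* Extract the exception cause field from a raw SEQSTAT value. */
static unsigned seqstat_excause(uint32_t seqstat)
{
    return (unsigned)(seqstat & SEQSTAT_EXCAUSE_MASK);
}
```

Calling seqstat_excause(0x00000024) with the SEQSTAT value from the log yields 0x24.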

 

Version:

 

root:/> version

kernel:    Linux release 2.6.28.10-ADI-2009R1-svn100, build #47 Mon Feb 8 16:20:16 CST 2010

toolchain: bfin-linux-uclibc-gcc release gcc version 4.1.2 (ADI svn)

user-dist: release svn-100, build #34 Mon Feb 8 16:21:22 CST 2010

=========================================================================================

 

Objdump ipv4_output.o:

 

    9e0:    20 e6 15 00     [P4 + 0x54] = R0;

     9e4:    07 18           IF CC JUMP 0x9f2 <_tcp_transmit_skb+0xf6>;

     9e6:    4a e1 00 00     P2.H = 0x0;        /* (  0)    P2=0x0 <_tcp_select_initial_window> */

     9ea:    0a e1 00 00     P2.L = 0x0;        /* (  0)    P2=0x0 <_tcp_select_initial_window> */

     9ee:    10 91           R0 = [P2];

     9f0:    40 00           STI R0;

     9f2:    1d e5 1f 00     P5 = [P3 + 0x7c];

     9f6:    60 e5 ac 00     R0 = W[P4 + 0x158] (X);

     9fa:    28 97           W[P5] = R0;

     9fc:    60 e5 a4 00     R0 = W[P4 + 0x148] (X);

     a00:    68 b4           W[P5 + 0x2] = R0;

     a02:    38 a1           R0 = [FP + 0x10];

 

static __inline__ __attribute_const__ __u32 ___arch__swahw32(__u32 xx)

{

    __u32 rv;

    __asm__("%0 = PACK(%1.L, %1.H);\n\t": "=d"(rv): "d"(xx));

     a04:    04 c6 00 40     R0 = PACK (R0.L, R0.H);

     a08:    81 c6 c0 83     R1 = R0 >> 0x8 (V);

     a0c:    81 c6 40 80     R0 = R0 << 0x8 (V);

     a10:    08 56           R0 = R0 | R1;

     a12:    68 b0           [P5 + 0x4] = R0;

     a14:    20 e4 a4 00     R0 = [P4 + 0x290];

     a18:    04 c6 00 40     R0 = PACK (R0.L, R0.H);

     a1c:    81 c6 c0 83     R1 = R0 >> 0x8 (V);

     a20:    81 c6 40 80     R0 = R0 << 0x8 (V);

     a24:    08 56           R0 = R0 | R1;

     a26:    a8 b0           [P5 + 0x8] = R0;

    return rv;

}
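
For readers not used to Blackfin assembly, the PACK plus vector-shift sequence in the dump above is the byte swap behind htonl(): PACK exchanges the two 16-bit halves, and the two (V) shifts OR-ed together exchange the bytes inside each half. A portable C sketch of the same steps (function names are illustrative only):

```c
#include <stdint.h>

/* R0 = PACK (R0.L, R0.H): the low half moves to the top, the high half
 * moves to the bottom. */
static uint32_t swap_halfwords(uint32_t x)
{
    return (x << 16) | (x >> 16);
}

/* The (V) shifts act on each 16-bit half independently, so the masks keep
 * bits from leaking across the half boundary. OR-ing both results swaps
 * the two bytes inside each half. */
static uint32_t swap_bytes_in_halfwords(uint32_t x)
{
    uint32_t right = (x >> 8) & 0x00FF00FFu; /* R1 = R0 >> 0x8 (V) */
    uint32_t left  = (x << 8) & 0xFF00FF00u; /* R0 = R0 << 0x8 (V) */
    return left | right;                     /* R0 = R0 | R1       */
}

/* Full 32-bit byte reversal, as emitted for htonl() on this
 * little-endian target. */
static uint32_t byte_reverse32(uint32_t x)
{
    return swap_bytes_in_halfwords(swap_halfwords(x));
}
```

The swap itself happens entirely in registers; only the following 32-bit store through P5 touches memory.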

 

 

Crash at:

 

a12:    68 b0           [P5 + 0x4] = R0;

 

P5 : 039cb12e isn't aligned to 4 bytes.
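
The alignment claim can be checked with simple arithmetic on the register value from the dump. A small sketch (is_aligned is a helper made up for this example):

```c
#include <stdint.h>

/* Return non-zero if addr is a multiple of size.
 * size must be a power of two (2, 4, 8, ...). */
static int is_aligned(uintptr_t addr, uintptr_t size)
{
    return (addr & (size - 1u)) == 0u;
}
```

With P5 = 0x039cb12e, is_aligned(P5 + 4, 4) is false, so the 32-bit store to [P5 + 0x4] faults, while is_aligned(P5, 2) is true, which fits the two earlier 16-bit stores W[P5] and W[P5 + 0x2] completing without an exception.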

 

Related C code:

 

th->seq            = htonl(tcb->seq);

 

The pointer "th" isn't aligned to 4 bytes.

"th" comes from "th = tcp_hdr(skb);", which means it is the skb->transport_header address.

 

How does the kernel make the address aligned to 4 bytes?

 

Any advice?

 

Best Regards

 

Xin

 


 

 

2010-02-09 03:37:30     Re: missalign data access when doing tcp_transmit_skb

Yi Li (CHINA)

Message: 85827   

 

Xin,

 

Are you using 2009R1 release or 2009R1 svn branch head?

 

Is it easy to reproduce your issue? If so, can you tell us how to reproduce?

 

A "misaligned data access" may be caused by other issues.

So it is necessary to use gdb to debug and trace why "P5" becomes misaligned.

 

-Yi


 

 

2010-02-09 10:46:44     Re: missalign data access when doing tcp_transmit_skb

Xin Xin (CHINA)

Message: 85842   

 

Hi Yi,

 

   I am using the 2009R1 release; the SVN number you saw comes from source control on our own SVN server. The problem can be reproduced with a BT transmission program after it has been running for hours. Other network programs work well, so it is strange.

   This happens in the kernel; do you mean I should debug it using KGDB?

   According to the disassembly, P5 should be the skb->transport_header address. I don't know what alignment the network stack requires for the transport header. Do you have any advice?

 

Best Regards

 

Xin


 

 

2010-02-09 21:05:36     Re: missalign data access when doing tcp_transmit_skb

Yi Li (CHINA)

Message: 85864   

 

Xin,

 

My suggestion would be to debug the kernel. You may use KGDB, but I prefer gdbproxy and gnICE: docs.blackfin.uclinux.org/doku.php?id=hw:jtag:gnice

 

-Yi


 

 

2010-02-24 04:54:32     Re: missalign data access when doing tcp_transmit_skb

Xin Xin (CHINA)

Message: 86447   

 

Hi, Yi,

 

       Sorry for the late reply. I have changed how values are assigned through "skb->transport_header" so that it is done by bits, and that works well. But the reset problem remains, as described in blackfin.uclinux.org/gf/project/uclinux-dist/forum/?_forum_action=ForumMessageBrowse&thread_id=39149&action=ForumBrowse&forum_id=39
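
The post does not show the modified code. One plausible reading of assigning "by bits" is to replace the single 32-bit store with byte-wide stores, which have no alignment requirement. A minimal sketch under that assumption (store_be32 is a name invented here; the kernel's put_unaligned() family of helpers serves a similar purpose):

```c
#include <stdint.h>

/* Write a 32-bit value in network (big-endian) byte order using only
 * byte-wide stores, so the destination may sit at any address. */
static void store_be32(unsigned char *dst, uint32_t value)
{
    dst[0] = (unsigned char)(value >> 24);
    dst[1] = (unsigned char)(value >> 16);
    dst[2] = (unsigned char)(value >> 8);
    dst[3] = (unsigned char)(value);
}
```

A store such as th->seq = htonl(tcb->seq) would then become store_be32((unsigned char *)&th->seq, tcb->seq). This hides the exception but not the reason the header became misaligned in the first place.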

 

     I used KGDB to debug the kernel over Ethernet, but the watchpoint cannot be set, as shown below:

 

(gdb) watch *(int *)0xFFC00100

 

Hardware watchpoint 1: *(int *) 4290773248

 

(gdb) c

 

Continuing.

 

Warning:

 

Could not insert hardware watchpoint 1.

 

Could not insert hardware breakpoints:

 

You may have requested too many breakpoints/watchpoints.

 

---------------------------------------------------------------------------------------------------------------------------

 

Does this mean I must use an ICE to debug the problem?

 

Best Regards

 

Xin


 

 

2010-02-24 05:05:45     Re: missalign data access when doing tcp_transmit_skb

Xin Xin (CHINA)

Message: 86448   

 

Also, we currently have an ADDS-HPUSB-ICE available. Could it help with debugging?

 

Thanks


 

 

2010-02-24 10:21:22     Re: missalign data access when doing tcp_transmit_skb

Mike Frysinger (UNITED STATES)

Message: 86465   

 

it is not possible to use hardware watchpoints without an ICE, so it isn't going to work with KGDB. this is a limitation in the Blackfin hardware itself.

 


 

 

2010-02-24 10:21:45     Re: missalign data access when doing tcp_transmit_skb

Mike Frysinger (UNITED STATES)

Message: 86466   

 

please read the documentation:

  docs.blackfin.uclinux.org/doku.php?id=hw:jtag#unsupported_devices


 

 

2012-11-01 04:35:19     Re: missalign data access when doing tcp_transmit_skb

Jon Kowal (GERMANY)

Message: 107575   

 

We were able to track this issue down to the following bug/fix:   patchwork.ozlabs.org/patch/129128/

 

In our setup, our Blackfin was communicating via TCP with a client that had a small receive buffer. After a few hours, including lots of TCP traffic and several reboots, the crash would occur. With that test running, the crash occurred about once every 24 hours.

 

By analysing the object dump we were able to track the crash down to the access of th->source in:

 

         /* Build TCP header and checksum it. */

        th = tcp_hdr(skb);

!CRASH! th->source              = inet->sport;

        th->dest                = inet->dport;

        th->seq                 = htonl(tcb->seq);

        th->ack_seq             = htonl(tp->rcv_nxt);

 

After inserting debug output at the beginning of tcp_transmit_skb() we were able to confirm that in every case of misalignment tcp_transmit_skb() had been called by tcp_retransmit_skb().

 

We backported the patch from   patchwork.ozlabs.org/patch/129128/ to the 2009R1 kernel and are not able to reproduce the bug anymore. I'll attach the patch, in case you're interested. Note: it's a diff from our internal SVN, so don't mind the revision numbers.
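
The attached patch is not reproduced in this thread. To illustrate how a retransmit path can end up with a misaligned header at all, here is a hypothetical userspace model. It assumes that already-acknowledged bytes are trimmed from the front of a queued segment and that the trimmed count need not be a multiple of 4. The struct, the function names, and the realignment step are all invented for this sketch and are not taken from the patch or from the kernel.

```c
#include <stddef.h>
#include <string.h>

#define TCP_HDR_LEN 20u /* TCP header without options */

/* Hypothetical stand-in for the few skb properties that matter here. */
struct pkt {
    unsigned char buf[256]; /* assumed to start on a 4-byte boundary */
    size_t data;            /* offset of the first payload byte */
    size_t len;             /* payload length in bytes */
};

/* Drop bytes the peer has already acknowledged from the front. */
static void trim_head(struct pkt *p, size_t acked)
{
    p->data += acked;
    p->len  -= acked;
}

/* Offset at which a header pushed in front of the payload would start. */
static size_t header_offset(const struct pkt *p)
{
    return p->data - TCP_HDR_LEN;
}

/* Move the payload to a chosen headroom (a multiple of 4, at least
 * TCP_HDR_LEN) so that a pushed header is 4-byte aligned again. */
static void realign(struct pkt *p, size_t headroom)
{
    memmove(p->buf + headroom, p->buf + p->data, p->len);
    p->data = headroom;
}
```

In this model, a segment with data = 64 has its header at offset 44, which is aligned. After trim_head(&p, 7) the header would start at offset 51, and a 16-bit or 32-bit store into it would fault on hardware that cannot handle misaligned accesses. Calling realign(&p, 64) before building the header restores alignment at the cost of one copy.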

 

Best regards,

 

Jon Kowal

DSPECIALISTS GmbH

 

tcp_retransmit_skb_alignment.patch
