2011-05-04 12:40:13 bfin_mac: <unknown>: hw csum failure
Stefan Wanja (GERMANY)
Message: 100424
Hello,
I can see an easy-to-reproduce problem on the bf537-ezkit lite (running the 2010 release) and on a bf527 custom board (running the 2009 release). When the network interface is activated after boot or after plugging a cable in, and packets are coming in at the same time, I get an error message like this:
PHY: 0:01 - Link is Up - 100/Full
<unknown>: hw csum failure.
Hardware Trace:
0 Target : <0x00115e78> { _dump_stack + 0x0 }
Source : <0x000d1926> { _netdev_rx_csum_fault + 0x36 } JUMP.L
1 Target : <0x000d1920> { _netdev_rx_csum_fault + 0x30 }
Source : <0x00115f7c> { _printk + 0x14 } RTS
2 Target : <0x00115f78> { _printk + 0x10 }
Source : <0x00011eea> { _vprintk + 0x16a } RTS
3 Target : <0x00011ede> { _vprintk + 0x15e }
Source : <0xffa00cf8> { __common_int_entry + 0xcc } RTI
4 Target : <0xffa00c96> { __common_int_entry + 0x6a }
Source : <0xffa00ae0> { _return_from_int + 0x58 } RTS
5 Target : <0xffa00ae0> { _return_from_int + 0x58 }
Source : <0xffa00ab6> { _return_from_int + 0x2e } IF !CC JUMP pcrel
6 Target : <0xffa00a88> { _return_from_int + 0x0 }
Source : <0xffa00c92> { __common_int_entry + 0x66 } JUMP.L
7 Target : <0xffa00c90> { __common_int_entry + 0x64 }
Source : <0xffa0038e> { _asm_do_IRQ + 0x6a } RTS
8 Target : <0xffa00386> { _asm_do_IRQ + 0x62 }
Source : <0x00015536> { __local_bh_enable + 0x3a } RTS
9 Target : <0x000154fc> { __local_bh_enable + 0x0 }
Source : <0x00015b7c> { ___do_softirq + 0xa4 } JUMP.L
10 Target : <0x00015b74> { ___do_softirq + 0x9c }
Source : <0x00015b68> { ___do_softirq + 0x90 } IF CC JUMP pcrel
11 Target : <0x00015b5a> { ___do_softirq + 0x82 }
Source : <0x000340e4> { _rcu_bh_qs + 0x14 } RTS
12 Target : <0x000340d0> { _rcu_bh_qs + 0x0 }
Source : <0x00015b56> { ___do_softirq + 0x7e } JUMP.L
13 Target : <0x00015b4e> { ___do_softirq + 0x76 }
Source : <0x000d1388> { _net_rx_action + 0x7c } RTS
14 Target : <0x000d136e> { _net_rx_action + 0x62 }
Source : <0x000d1396> { _net_rx_action + 0x8a } IF CC JUMP pcrel (BP)
15 Target : <0x000d1392> { _net_rx_action + 0x86 }
Source : <0x000d1216> { _process_backlog + 0x8e } RTS
One way to reproduce this is to start the board, start "iperf -s -u" on another computer, then start "iperf -c <server-ip> -u -b 100M -i 30 -d -t 99999" on the board, and then either unplug the cable and plug it in again, or just reset the board and wait for the network interface to be reactivated.
Most of the time the error messages keep flooding the console as long as packets are coming in, but sometimes the system even seems to hang. Please see the complete logs in the attached files.
I also observed that U-Boot fails to fetch an IP address via DHCP, gets a complete nonsense IP, or gets one but the TFTP transfer then fails or times out, when the iperf packets are being sent to the device at the same time.
Should a bug be filed in the tracker?
Kind Regards,
Stefan
csum_error.txt
csum_error_with_hang.txt
2011-05-06 02:09:25 Re: bfin_mac: <unknown>: hw csum failure
Sonic Zhang (CHINA)
Message: 100451
Can't replicate.
2010R1-RC5 Linux on bf537-stamp IP 10.100.4.50
Host Linux PC IP 10.100.4.174
As client:
root:/> iperf -c 10.100.4.174 -u -b 100M -i 30 -d -t 99999
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size: 104 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.100.4.174, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 104 KByte (default)
------------------------------------------------------------
[ 5] local 10.100.4.50 port 38320 connected with 10.100.4.174 port 5001
[ 6] local 10.100.4.50 port 5001 connected with 10.100.4.174 port 42851
PHY: 0:01 - Link is Down
PHY: 0:01 - Link is Up - 100/Full
PHY: 0:01 - Link is Down
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-30.0 sec 151 MBytes 42.3 Mbits/sec
PHY: 0:01 - Link is Up - 100/Full
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 6] 0.0-30.0 sec 201 MBytes 56.1 Mbits/sec 0.146 ms 86196/229327 (38%)
PHY: 0:01 - Link is Down
PHY: 0:01 - Link is Up - 100/Full
[ ID] Interval Transfer Bandwidth
[ 5] 30.0-60.0 sec 187 MBytes 52.2 Mbits/sec
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 6] 30.0-60.0 sec 165 MBytes 46.2 Mbits/sec 0.106 ms 150291/268115 (56%)
^C[ ID] Interval Transfer Bandwidth
[ 5] 0.0-87.9 sec 414 MBytes 39.5 Mbits/sec
[ 5] Sent 295119 datagrams
[ 5] Server Report:
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 5] 0.0-87.9 sec 185 MBytes 17.7 Mbits/sec 0.024 ms 162981/295118 (55%)
[ 5] 0.0-87.9 sec 1 datagrams received out-of-order
As server:
root:/> iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size: 104 KByte (default)
------------------------------------------------------------
[ 5] local 10.100.4.50 port 5001 connected with 10.100.4.174 port 42851
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 5] 0.0- 4.6 sec 52.0 MBytes 95.0 Mbits/sec 0.029 ms 992590/1029651 (96%)
[ 5] 0.0- 4.6 sec 1 datagrams received out-of-order
[ 6] local 10.100.4.50 port 5001 connected with 10.100.4.174 port 43382
------------------------------------------------------------
Client connecting to 10.100.4.174, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 104 KByte (default)
------------------------------------------------------------
[ 7] local 10.100.4.50 port 53282 connected with 10.100.4.174 port 5001
PHY: 0:01 - Link is Down
PHY: 0:01 - Link is Up - 100/Full
PHY: 0:01 - Link is Down
PHY: 0:01 - Link is Up - 100/Full
PHY: 0:01 - Link is Down
PHY: 0:01 - Link is Up - 100/Full
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 6] 0.0-47.5 sec 276 MBytes 48.7 Mbits/sec 0.168 ms 196873/393680 (50%)
[ 6] 0.0-47.5 sec 1 datagrams received out-of-order
write2 failed: Connection refused
[ ID] Interval Transfer Bandwidth
[ 7] 0.0-48.4 sec 266 MBytes 46.0 Mbits/sec
[ 7] Sent 189454 datagrams
read failed: Connection refused
[ 7] WARNING: did not receive ack of last datagram after 2 tries.
2011-05-06 08:34:22 Re: bfin_mac: <unknown>: hw csum failure
Stefan Wanja (GERMANY)
Message: 100486
Hello Sonic,
Maybe it makes a difference whether the device is connected directly to the PC or via a switch. I think that when directly connected, the PC itself doesn't send data until the link is completely established; maybe a switch is "faster" at outputting the first packets... In my previous post the connection was made via a Gigabit switch. When I connect a Gigabit laptop it also occurs, but not as often.
Maybe the cache mode makes a difference. We are using write-through cache. The kernel config is attached. I'll check whether I can reproduce it with write-back as well.
The binary kernel image with the filesystem is attached as well. Maybe you can check with this?
Kind Regards,
Stefan
uImage.gz.initramfs
config
2011-05-09 02:20:28 Re: bfin_mac: <unknown>: hw csum failure
Sonic Zhang (CHINA)
Message: 100503
We have no Gigabit switch to run your test. Please update your result with the default 2010R1-RC5 kernel config (WB enabled).
2011-05-09 12:28:15 Re: bfin_mac: <unknown>: hw csum failure
Stefan Wanja (GERMANY)
Message: 100520
Hey Sonic,
With the default kernel config (WRITE BACK) the problem disappears. It seems to depend on WRITE THROUGH.
So, could you please try the uploaded kernel with its attached filesystem (WRITE THROUGH) to reproduce the problem on your side?
Kind Regards,
Stefan
2011-05-10 00:18:38 Re: bfin_mac: <unknown>: hw csum failure
Sonic Zhang (CHINA)
Message: 100531
The hardware checksum feature is only enabled when WB cache is disabled. On bf537 v0.2, hardware checksum works as expected and no crash is observed. But on bf537 v0.3, hardware checksum generates wrong results, so netdev_rx_csum_fault() dumps error information.
The current workaround is to disable hardware checksum in bfin_mac.h for bf537 v0.3.
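For readers unfamiliar with the message: as far as I understand the generic kernel receive path, netdev_rx_csum_fault() is reached when the checksum value supplied by the hardware fails to verify but a checksum recomputed in software succeeds, which would be consistent with the MAC producing wrong results. The sketch below is not the bfin_mac driver code. It is only a minimal Python illustration of the 16-bit one's-complement Internet checksum (RFC 1071) that such a software check recomputes. The function names internet_checksum and checksum_ok are made up for this example.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement Internet checksum (RFC 1071) of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte

    total = 0
    for i in range(0, len(data), 2):
        # Add each 16-bit big-endian word to the running sum.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    # The checksum is the one's complement of the folded sum.
    return ~total & 0xFFFF


def checksum_ok(data_with_checksum: bytes) -> bool:
    """Verify data that already contains its checksum field.

    Summing over data that includes a correct checksum folds to 0xFFFF,
    so the complemented result is 0.
    """
    return internet_checksum(data_with_checksum) == 0
```

As a usage example, for the 20-byte IPv4 header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7 (checksum field zeroed), internet_checksum returns 0xb861. With b8 61 written into the checksum field, checksum_ok returns True, and flipping any single bit makes it return False.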
2011-05-11 05:30:10 Re: bfin_mac: <unknown>: hw csum failure
Stefan Wanja (GERMANY)
Message: 100571
Hello again,
Please also note that it happens on our bf527 0.2 board as well...
2011-05-11 05:56:18 Re: bfin_mac: <unknown>: hw csum failure
Sonic Zhang (CHINA)
Message: 100572
It seems this issue only occurs on newer chip revisions.