[#5716] network fails with CONFIG_BFIN_EXTMEM_WRITEBACK

Document created by Aaronwu Employee on Sep 5, 2013

Submitted By: Peter Meerwald

Open Date: 2009-11-22 18:11:59
Close Date: 2010-06-11 04:53:54
Priority: Medium
Assignee: Sonic Zhang
Status: Closed
Fixed In Release: N/A
Found In Release: snaps
Release:
Category: Kernel Functions
Board: STAMP
Processor: BF536
Silicon Revision:
Is this bug repeatable?: Yes
Resolution: Fixed
Uboot version or rev.:
Toolchain version or rev.: 09r1
App binary format: N/A

Summary: network fails with CONFIG_BFIN_EXTMEM_WRITEBACK

Details:

 

Using the latest git kernel. The kernel error is not always the same; the network works with the writethrough policy.

 

TCP cubic registered

NET: Registered protocol family 17

RPC: Registered udp transport module.

RPC: Registered tcp transport module.

IP-Config: Complete:

     device=eth0, addr=172.20.9.66, mask=255.255.255.0, gw=172.20.9.1,

     host=et908a, domain=, nis-domain=(none),

     bootserver=172.20.9.1, rootserver=172.20.9.21, rootpath=

Looking up port of RPC 100003/3 on 172.20.9.21

Looking up port of RPC 100005/3 on 172.20.9.21

 

VFS: Mounted root (nfs filesystem) on device 0:12.

Freeing unused kernel memory: 84k freed

skb_over_panic: text:000c1f34 len:65535 put:65535 head:0093e000 data:0093e022 tail:0x94e021 end:0x93e660 dev:<NULL>

------------[ cut here ]------------

Kernel BUG at 000f57a2 [verbose debug info unavailable]

Kernel panic - not syncing: BUG()

 

Follow-ups

 

--- Sonic Zhang                                              2009-11-24 04:15:28

Which Blackfin Linux release are you using? How can the failure be reproduced?

Please give detailed steps or example code; otherwise we can do nothing.

 

--- Peter Meerwald                                           2009-11-24 10:11:04

Hello Sonic,

 

I'm using latest git kernel; board is custom, similar to bf537-stamp, CPU is

bf536 rev 3

 

I have to correct my earlier statement that the issue only shows up with

CONFIG_BFIN_EXTMEM_WRITEBACK; it also happens with WRITETHROUGH (perhaps less

frequently)

 

I have investigated a bit further and think I have discovered the root of the

problem:

 

have a look at bfin_mac_rx() in drivers/net/bfin_mac.c

 

lines ~1040 following:

 

len = (unsigned short)((current_rx_ptr->status.status_word) &

RX_FRLEN);

/* Deduce Ethernet FCS length from Ethernet payload length */

len -= ETH_FCS_LENGTH;

skb_put(skb, len);

 

the code gets the frame length from the RX status word (note that no error

checking is performed), reduces it by ETH_FCS_LENGTH (== 4) and passes the value

to skb_put(); skb_put() raises skb_over_panic if the skb tail > end;

 

in case of an error condition (which would show up in the higher bits of the RX

status word, but they are not checked), the frame length might be invalid (the

hardware ref. manual doesn't say if the frame length is valid in case of error,

IMHO)

 

on my hardware I get sporadically the following values in RX status word:

0x81041003, 0x81041001 (note that the frame size flag is set, 0x40000)

the frame length (?) is 1 or 3

 

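if we subtract ETH_FCS_LENGTH from this invalid frame length we get a negative

result (65535 as a 16-bit unsigned value for a length of 3, which matches the

len:65535 in the panic above) and skb_put() fails with skb_over_panic()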
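
The arithmetic described above can be reproduced outside the kernel. The following is only a sketch: the RX_FRLEN mask value is an assumption (the real definition lives in the Blackfin headers), and rx_len_unchecked() is a hypothetical helper that mirrors the quoted lines from bfin_mac_rx().

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mask of the frame-length field in the RX status word. */
#define RX_FRLEN       0x000007FFu
/* Ethernet FCS length, == 4 as stated in the report. */
#define ETH_FCS_LENGTH 4u

/* Hypothetical helper: same length arithmetic as the quoted driver code,
 * with no check of the error bits and no check for a too-short frame. */
static uint16_t rx_len_unchecked(uint32_t status_word)
{
    uint16_t len = (uint16_t)(status_word & RX_FRLEN);

    /* For len < 4 the subtraction wraps modulo 65536. */
    len = (uint16_t)(len - ETH_FCS_LENGTH);
    return len;
}

/* Usage: the two status words reported on the failing board. */
static void show_reported_values(void)
{
    assert(rx_len_unchecked(0x81041003u) == 65535); /* 3 - 4 wraps */
    assert(rx_len_unchecked(0x81041001u) == 65533); /* 1 - 4 wraps */
}
```

A wrapped value of this size is far larger than the room left in the skb, which is consistent with the skb_over_panic in the log.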

 

I think bfin_mac_rx() should do proper error checking on the RX status word and

drop the frame:

 

    if (current_rx_ptr->status.status_word & 0x40000) {

        printk(KERN_NOTICE DRV_NAME

               ": rx: invalid frame - packet dropped\n");

        dev->stats.rx_dropped++;

        current_rx_ptr->status.status_word = 0x00000000;

        current_rx_ptr = current_rx_ptr->next;

        goto out;

    }

 

probably the check of the status_word should be a bit smarter...
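
One possible shape for a "smarter" check is sketched below as a standalone function. This is not the attached patch: rx_frame_usable(), RX_ERR_BIT and the buf_room parameter are hypothetical names, the RX_FRLEN mask value is an assumption, and only the 0x40000 bit from the snippet above is tested. The idea is to reject a frame both when the error bit is set and when the reported length cannot yield a valid payload length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_FRLEN       0x000007FFu /* assumption: frame-length field mask */
#define RX_ERR_BIT     0x00040000u /* hypothetical name for the bit tested above */
#define ETH_FCS_LENGTH 4u

/* Hypothetical helper: returns true and stores the payload length only if
 * it would be safe to hand that length to skb_put().
 * buf_room is the number of bytes still free in the receive buffer. */
static bool rx_frame_usable(uint32_t status_word, uint32_t buf_room,
                            uint16_t *payload_len)
{
    uint32_t frlen = status_word & RX_FRLEN;

    assert(payload_len != NULL);

    if (status_word & RX_ERR_BIT)
        return false;               /* hardware flagged the frame */
    if (frlen <= ETH_FCS_LENGTH)
        return false;               /* subtraction would wrap or give 0 */
    if (frlen - ETH_FCS_LENGTH > buf_room)
        return false;               /* would overrun the buffer */

    *payload_len = (uint16_t)(frlen - ETH_FCS_LENGTH);
    return true;
}
```

With this sketch, the status words 0x81041003 and 0x81041001 are rejected, and a status word with a short length is rejected even if the error bit happens to be clear.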

 

further, I think the following code in bfin_mac_rx() is incorrect:

 

    if (!new_skb) {

        printk(KERN_NOTICE DRV_NAME

               ": rx: low on mem - packet dropped\n");

        dev->stats.rx_dropped++;

        goto out;

    }

 

if no memory is available, the status_word should be set to zero and the

current_rx_ptr should be advanced to the next, such as:

 

    current_rx_ptr->status.status_word = 0x00000000;

    current_rx_ptr = current_rx_ptr->next;

 

the current code should lead to a loop between the interrupt handler

bfin_mac_interrupt() and bfin_mac_rx()
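
The suspected loop can be illustrated with a small userspace model. It is a simplification under stated assumptions: struct rx_desc is a hypothetical two-field descriptor, and the handler is assumed to keep calling the receive routine for as long as the current descriptor's status word is non-zero. Allocation is modelled as always failing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified RX descriptor for illustration only. */
struct rx_desc {
    uint32_t status_word;
    struct rx_desc *next;
};

/* Release a descriptor without delivering its frame, as suggested above. */
static struct rx_desc *drop_and_advance(struct rx_desc *cur)
{
    assert(cur != NULL && cur->next != NULL);
    cur->status_word = 0x00000000;
    return cur->next;
}

/* Model of the interrupt path when every skb allocation fails.
 * Assumption: the handler loops while the current status word is non-zero.
 * Returns the number of passes, capped at max_passes. */
static int drain_ring(struct rx_desc *cur, int release_on_failure,
                      int max_passes)
{
    int passes = 0;

    while (cur->status_word != 0 && passes < max_passes) {
        passes++;
        if (release_on_failure)
            cur = drop_and_advance(cur);
        /* otherwise the descriptor is left untouched and is seen again */
    }
    return passes;
}
```

In this model, a ring of two completed descriptors is drained in two passes when the drop path releases the descriptor, while the variant that leaves it untouched only stops at the pass cap.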

 

 

I do not have a good explanation why 2.6.31-latest fails on my board but

2.6.30.4 works; nevertheless, the code in bfin_mac_rx() should be improved

 

please let me know if you have further questions; I'd appreciate any comments

on my observations; I have attached a patch which fixes the problem for me (not

sure if that's generally usable)

 

regards, p.

 

 

--- Sonic Zhang                                              2009-11-25 05:42:38

Applied. Thanks.

 

 

 

    Files


 

File Name     File Type     File Size     Posted By

invalid_frame.patch    text/x-patch    1152    Peter Meerwald
