2008-01-25 07:49:41 bfin_mac: RX from DMA to skb payload bug if cache policy is WB
Alexey Demin (RUSSIAN FEDERATION)
Message: 50144
Hi all!
I think there is a bug in the receive path of the bfin_mac driver when the cache policy is write-back (WB).
DMA RX uses the skb payload area to receive Ethernet packets.
When an skb is destroyed with kfree_skb(), the kernel can reuse its payload area
for another packet.
If that payload was modified (for example, by IP forwarding decrementing the TTL),
the change may still sit only in the data cache.
When DMA transfers the new packet into the payload area (skb->data) and,
at the same time, the stale dirty cache line left over from the deleted packet is written back, the new packet's data can be overwritten with the old packet's data.
So we must invalidate the cache over the payload before DMA is allowed
to deliver new data into it.
Here are the diffs for detecting the bug and for fixing it.
To check for the bug:
uncomment #define CHECK_CACHE_BUG
make sure the CONFIG_BLKFIN_WB cache policy is ON.
run ping -i 0 or ping -f from any host to this box.
To fix the bug:
uncomment #define FIX_CACHE_BUG
With best regards, Lex.
==================================
--- bfin_mac.c (revision 4101)	2008-01-21 20:17:20.000000000 +0300
+++ bfin_mac.c (revision 4101, fixed)	2008-01-25 14:54:21.000000000 +0300
@@ -636,6 +636,17 @@
return 0;
}
+#define FIX_CACHE_BUG
+#define CHECK_CACHE_BUG
+
+#if defined(FIX_CACHE_BUG) && defined(CONFIG_BLKFIN_WT)
+#error There is nothing to fix when write through (WT) cache policy is ON
+#endif
+
+#ifdef CHECK_CACHE_BUG
+#include <linux/ip.h> /* struct iphdr */
+#endif
+
static void bf537mac_rx(struct net_device *dev)
{
struct sk_buff *skb, *new_skb;
@@ -653,6 +664,14 @@
}
/* reserve 2 bytes for RXDWA padding */
skb_reserve(new_skb, 2);
+
+#ifdef FIX_CACHE_BUG
+	/* Important: invalidate now, so that stale dirty cache lines are
+	 * not flushed back over the payload while DMA fills it with new
+	 * data.
+	 */
+ blackfin_dcache_invalidate_range((unsigned long)new_skb->head, (unsigned long)new_skb->end);
+#endif
+
current_rx_ptr->skb = new_skb;
current_rx_ptr->desc_a.start_addr = (unsigned long)new_skb->data - 2;
@@ -669,6 +688,28 @@
skb->ip_summed = CHECKSUM_COMPLETE;
#endif
+#ifdef CHECK_CACHE_BUG
+	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = (struct iphdr *)skb->data;
+
+		printk(KERN_DEBUG "bfin_mac rx_pkt: skb=%p(%p-%p), id=%d, pkt_len=%d, ttl=%d\n",
+			skb, skb->head, skb->end,
+			ntohs(iph->id), len, iph->ttl);
+
+		if (iph->ttl == 55)
+			printk(KERN_WARNING "oops, new packet has been corrupted with old data!\n");
+
+		/* tag the packet so a corrupted reused buffer is detectable */
+		iph->ttl = 55;
+		iph->check = 0;
+		iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
+	}
+#endif
+
netif_rx(skb);
lp->stats.rx_packets++;
lp->stats.rx_bytes += len;
=============================
2008-01-25 11:33:43 Re: bfin_mac: RX from DMA to skb payload bug if cache policy is WB
Bryan Wu (CHINA)
Message: 50159 Thanks a lot, Alexey
I will try your patch right away. Generally, I think your analysis is right.
Regards,
-Bryan
2008-01-29 06:17:24 Re: bfin_mac: RX from DMA to skb payload bug if cache policy is WB
Bryan Wu (CHINA)
Message: 50271 Hi Alexey,
I caught the cache bug you mentioned. Let me try to understand the solution and record it here:
in the driver:
1. old_skb->data is filled by DMA with TTL=64, then you modify it to TTL=55. Note that this TTL=55 sits in the data cache until it is written back to memory.
2. old_skb is passed to the upper layers and finally freed by kfree_skb(). At this point the data cache may still differ from real memory if the dirty line has not been written back.
3. bf537mac_rx() calls dev_alloc_skb() to get a new_skb. As you point out, this new_skb->data may be the same buffer as old_skb->data.
4. Soon, DMA fills this new_skb->data with new data, TTL=64 (in real memory), while the data cache still holds TTL=55.
5. Possibly soon after, the dirty cache line is written back to memory and overwrites the new data with TTL=55.
6. And so we get this bug.
With your patch, the new_skb->data cache is invalidated, so there is no further write-back from the data cache to memory, and the DMA data is not overwritten.
Thanks, I will apply this to our svn and the 08R1 release, then send it to the upstream mainline.
One suggestion: maybe this cache invalidate operation should be added to alloc_skb(), so other architectures can share this protection.
-Bryan
2008-01-29 10:44:24 Re: bfin_mac: RX from DMA to skb payload bug if cache policy is WB
Alexey Demin (RUSSIAN FEDERATION)
Message: 50315
Hi Bryan. You have understood it correctly; thanks for your attention.
Note that this problem applies only to DMA, which knows nothing about the cache.
Perhaps it is specific to CPUs that do not invalidate the cache before DMA fills the memory?
The kernel code itself operates on data either from the cache or from memory,
but never both at once, so it does not require cache invalidation.
I think this fix must live in any driver that uses skb payload buffers for DMA transfers.
Regards, Lex.
2008-01-29 12:29:18 Re: bfin_mac: RX from DMA to skb payload bug if cache policy is WB
Bryan Wu (CHINA)
Message: 50323 Hi Lex,
You have caught a very subtle bug that I have never met before. Thanks.
Generally, though, I think this comes from the skb reuse behaviour introduced by the slab/slub kmem_cache_* allocators.
It is pretty common in the kernel world. Imagine other subsystems, such as USB's urb structures, that use DMA for data transfer; they also need to invalidate the buffer.
So if a buffer is allocated from the slab/slub kmem caches and is used for a DMA transfer, it should be invalidated before DMA uses it, right?
Regards,
-Bryan
2008-01-29 13:42:19 Re: bfin_mac: RX from DMA to skb payload bug if cache policy is WB
Alexey Demin (RUSSIAN FEDERATION)
Message: 50328
Yes, Bryan, you are right.
Maybe we should set a gfp flag in alloc_skb(), like GFP_DMA, and have slab/slob invalidate the memory area on every allocation carrying that flag?