2008-08-10 07:27:25 bfin_sport driver may corrupt memory if DMA is used?
Michael McTernan (UNITED KINGDOM)
Message: 60210
Hi,
The bfin_sport driver can be configured to use DMA, in which case a read will do:
static ssize_t sport_read(struct file *filp, char __user *buf, size_t count,
                          loff_t *f_pos)
{
        ...
        if (cfg->dma_enabled) {
                ...
                /* Invalidate the buffer */
                invalidate_dcache_range((unsigned long)buf, \
                                        (unsigned long)(buf + count));
I think this will invalidate any 32-byte cache lines over that region using FLUSHINV. Therefore, if buf or count is not 32-byte aligned and dcaching is enabled, I think there is a small risk that access to adjacent data (i.e. within the same cache line) from some other thread during the transfer could cause the result of the DMA transfer to be missed.
The sport driver page doesn't mention this - have I got this right or misunderstood?
Assuming this is what's going on, what is the best thing to do in such a situation - align the buffers to cache lines from user space (maybe I can add WARN_ON() in bfin_sport to catch alignment errors), or should I be striving to use kernel DMA pools, somehow passing the buffers to the userspace app?
Regards,
Mike
2008-08-10 23:20:55 Re: bfin_sport driver may corrupt memory if DMA is used?
Yi Li (CHINA)
Message: 60214
> I think there is a small risk that access to adjacent data (i.e. within the same cache line) from some other thread during the transfer could cause the result of the DMA transfer to be missed.
Sorry, I don't understand what the risk is here.
In mach-common/cache.S:
ENTRY(_blackfin_dcache_invalidate_range)
        R2 = -L1_CACHE_BYTES;
        R2 = R0 & R2;
        P0 = R2;
        P1 = R1;
        CSYNC(R3);
        FLUSHINV[P0];
Do you mean the alignment here would cause problems in certain cases?
-Yi
2008-08-11 00:27:22 Re: bfin_sport driver may corrupt memory if DMA is used?
Mike Frysinger (UNITED STATES)
Message: 60217
i imagine his point is if you have a buffer of say 5 bytes followed by 3 bytes for other variables, then using dma on the buffer may cause problems in a threaded application between the last byte of the buffer and the 3 other bytes. for example:
struct foo {
        char buffer[5];
        char state, count, something;
};
this issue isn't limited to the sport driver. anything that uses these cache functions would be affected.
2008-08-11 00:39:23 Re: bfin_sport driver may corrupt memory if DMA is used?
Robin Getz (UNITED STATES)
Message: 60218
Yi:
I think that the risk that Mike is asking about is:
if the start or end of buf falls on an incomplete cache line, the act of doing the FLUSHINV() - which acts on the entire line (32 bytes) - will affect the adjacent data.
Any adjacent data should be OK - if the line is in the cache and dirty, FLUSHINV will write the cache line out to the next level of memory in the hierarchy.
This will not affect that data - but could affect the buffer. (I think).
If some data A and buf are in the same cache line, the FLUSHINV will write things out and mark the line as invalid. If the data A is then used before the DMA transfer is complete, the cache line will be read before the DMA is done, and a bogus buf will be living in the cache...
So, I think the answer is yes - there might be an issue that requires all DMA buffers to be cache line aligned and to consume full lines (32 bytes).
-Robin
2008-08-11 04:55:20 Re: bfin_sport driver may corrupt memory if DMA is used?
Michael McTernan (UNITED KINGDOM)
Message: 60226
Hi Robin,
Right, Mike F's example is a good illustration of the problem.
> This will not effect any data - but could effect the buffer. (I think).
> The if the data A and buf are in the same cache line, the FLUSHINV
> will write things out, and mark the line as invalid. If the data A is used
> before the DMA transfer is complete - the cache line will be read
> before the DMA is done, and a bogus buf will be living in the cache...
Exactly. And only in the case of a DMA read from the peripheral to a non-aligned buffer with dcache enabled and some other thread making access.
I've discovered this because I've a small list of buffers that are handed off to a thread for DMA reads prior to processing (a producer-consumer arrangement). This allows the application to keep working on new data without blocking while the DMA pulls in the next work unit. Unfortunately some of the buffers are adjacent and not 32-byte aligned, and hence the processing of one buffer during a transfer can cause a subsequent buffer to be corrupted. I'm using write-back dcache, which is probably the most vulnerable.
Making the buffers all 32-byte aligned (with a bit of padding on the end where needed) indeed appears to fix this problem.
Having confirmed this, I'm not sure what action should be taken. I'd suggest the sport driver page (http://docs.blackfin.uclinux.org/doku.php?id=sport_driver&s[]=sport) be amended with something like:
Caveats
If using DMA transfers to read from the SPORT in a system with dcache enabled, it is prudent to ensure that buffers passed to the SPORT driver by read() are aligned to a dcache line of 32 bytes, and similarly end on, or are padded up to, a cache line boundary. This ensures that no other data shares a cache line with the start or end of the DMA buffer, thereby avoiding the risk that access to nearby data during a DMA transfer could cause a cache line to be fetched before the DMA is complete; in such a case the read data may not be seen by the application.
For simple structures or buffers, the gcc aligned attribute can accomplish this:
struct S { short f[64]; } __attribute__ ((aligned (32)));
Locally I've added a WARN_ON() to the SPORT driver to check the alignment (buffer start and end) of buffers passed in for reads. I would however also suggest that the sport driver be changed to either:
- return an error in the case this problem could occur (e.g. if dcache is enabled and a non-aligned buffer is passed, return -EIO or -EINVAL)
- use a non-DMA transfer for bytes up to a cache line boundary, then DMA anything aligned, finally using non-DMA if needed for any tail bytes
The first option is quick to implement, but may break things (although it may also flush out some very hard bugs), while the second is more friendly but adds a hidden performance penalty.
Regards,
Mike
2008-08-11 06:10:34 Re: bfin_sport driver may corrupt memory if DMA is used?
Yi Li (CHINA)
Message: 60239
I've added a bug to track this issue:
Thanks.
2008-08-11 16:02:23 Re: bfin_sport driver may corrupt memory if DMA is used?
Mike Frysinger (UNITED STATES)
Message: 60255
i don't think alignment alone will help you. you need both alignment and buffer padding. plus, the alignment needs to be applied to the buffer, not the structure.
i'm not so worried about the sport driver ... i'm worried about all the other places we don't know about. memory dmas will have the same problem which means anything that calls dma_memcpy() ... or any userspace application that uses the memcpy syscall ...
although, this issue applies only in the following scenarios:
- shared memory buffer between processes (which isn't terribly common)
- dma and asynchronous operations (more common, but still not much)
- threaded applications and any dma (which is where it'll probably be seen the most)
2008-08-13 06:56:16 Re: bfin_sport driver may corrupt memory if DMA is used?
Michael McTernan (UNITED KINGDOM)
Message: 60388
> i'm worried about all the other places we dont know about.
Maybe the CPLB could be used to disable caching over the region covering the buffer for the duration of the transfer? This could cause a performance hit if the DMA destination buffer was small and shared the CPLB entry with other hot data, but would be very safe. DMA transfer for small buffers probably isn't much of a win in any case.
2008-08-13 14:46:15 Re: bfin_sport driver may corrupt memory if DMA is used?
Robin Getz (UNITED STATES)
Message: 60422
Michael:
After a little discussion with Mike, we came to the conclusion that any driver (like the SPORT driver) which operates directly on a buffer coming from userspace is asking for badness. It should be using copy_{to,from}_user() and fixing this type of thing itself. Since this would add a memcpy in places you don't really want one, the best thing is to mmap the device and let the kernel allocate the buffer - that way it is not going to happen.
-Robin