2008-04-17 21:45:26 failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54440 I'm receiving serial data at 57,600 baud into UART1 on the BF537-STAMP running 2008R1, and I'm printf()'ing it to the screen. The amount of received data is around 100 bytes, every 2 seconds, and it's binary data (not ASCII). I configure UART1 like so:
// the following code is for testing out UART 1
// see: http://www.easysw.com/~mike/serial/serial.html
// open the UART1 port
fd = open("/dev/ttyBF1", O_RDWR | O_NOCTTY | O_NDELAY);
if (fd == -1) {
    printf("ERROR: could not open UART1\n");
    return 1;
}
else
    fcntl(fd, F_SETFL, FNDELAY);
// set the UART1 baud rate to 57600
tcgetattr(fd, &uart1options); // get current port settings
cfsetispeed(&uart1options, B57600);
cfsetospeed(&uart1options, B57600); // set in & out baudrates to 57600
uart1options.c_cflag |= (CLOCAL | CREAD); // enable receiver & set local mode
uart1options.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG); // set raw mode (we want simple binary bytes)
// no parity (8 data bits, 1 stop bit, no parity)
uart1options.c_cflag &= ~PARENB;
uart1options.c_cflag &= ~CSTOPB;
uart1options.c_cflag &= ~CSIZE;
uart1options.c_cflag |= CS8;
uart1options.c_cflag &= ~CRTSCTS; // no hardware flow control
uart1options.c_iflag &= ~(IXON | IXOFF | IXANY); // no software flow control
uart1options.c_oflag &= ~OPOST; // raw output
tcsetattr(fd, TCSANOW, &uart1options); // set the new port options right now
then I sit in an endless loop, displaying these received bytes on the screen, like so:
while (1) {
    i = read(fd, uart1inbuff, 100); // i contains # of bytes received from uart1
    for (j=0; j<i; j++) {
        if (uart1inbuff[j] == 0xa0) printf("\n\n"); // a0 is often a "start of message" indicator
        printf("%x ", uart1inbuff[j]); // print the received bytes in hex format
    }
}
Not very complicated I'll admit, but it's useful because I can confirm the external module supplying the periodic serial data is operating as expected, as a prelude to integrating it into our system.
HOWEVER... this code runs for around 20 minutes, then the blackfin reboots with the following message:
BUG: failure at include/linux/timer.h:153/add_timer()!
Kernel panic - not syncing: BUG!
icache_enable
U-Boot 1.1.6-svn (ADI-2007R1) (Oct 4 2007 - 12:42:05)
It's quite repeatable - let it run unmolested for 20 to 25 minutes, and the blackfin will reboot.
I took a look at the add_timer function it's referring to; I didn't find it very enlightening:
/**
 * add_timer - start a timer
 * @timer: the timer to be added
 *
 * The kernel will do a ->function(->data) callback from the
 * timer interrupt at the ->expires point in the future. The
 * current time is 'jiffies'.
 *
 * The timer's ->expires, ->function (and if the handler uses it, ->data)
 * fields must be set prior calling this function.
 *
 * Timers with an ->expires field in the past will be executed in the next
 * timer tick.
 */
static inline void add_timer(struct timer_list *timer)
{
    BUG_ON(timer_pending(timer));
    __mod_timer(timer, timer->expires);
}
I don't even know what might be calling add_timer(), let alone have an idea of why it may be causing the system to restart. Any thoughts or suggestions would be much appreciated. Thanks!
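A side note on the display loop in the post above: the thread does not show how uart1inbuff is declared. If it is a plain char array and char is signed on the target, a byte such as 0xa0 is sign-extended when passed to printf("%x") (printing ffffffa0), and the comparison with 0xa0 never matches. Below is a minimal sketch of a formatter that avoids this by reading the data as unsigned char. format_hex is a hypothetical helper name, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format n received bytes as two-digit hex values separated by spaces.
 * The data is read through an unsigned char pointer, so values >= 0x80
 * print as "a0" rather than a sign-extended "ffffffa0".
 * Returns the number of characters written (not counting the NUL),
 * or -1 if the output buffer is too small.
 */
int format_hex(const void *data, size_t n, char *out, size_t outsz)
{
    const unsigned char *p = data;
    size_t used = 0;
    size_t i;

    assert(out != NULL);
    assert(data != NULL || n == 0);

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        /* each byte needs 3 characters ("xx ") plus the final NUL */
        if (outsz - used < 4)
            return -1;
        snprintf(out + used, outsz - used, "%02x ", p[i]);
        used += 3;
    }
    return (int)used;
}
```

In the read loop, format_hex(uart1inbuff, i, line, sizeof line) would be called only when read() returns a positive count, since a non-blocking read() returns -1 when no data is waiting.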
2008-04-17 22:05:50 Re: failure at include/linux/timer.h:153/add_timer()!
Rob D (UNITED STATES)
Message: 54442 I'm not sure what the problem is... but maybe you could use 'stty' to set up the port and just use 'cat' with 'hexdump' or something?
cat /dev/ttyBF1 | hexdump -C
Good luck finding the solution!
2008-04-17 22:11:54 Re: failure at include/linux/timer.h:153/add_timer()!
Mike Frysinger (UNITED STATES)
Message: 54443 when you say "on the screen", what are you talking about ? do you have two UARTs or just one ? are you connecting to the board via telnet or a different UART ? what about console ?
2008-04-17 23:03:15 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54448 Sorry for the lack of clarity on the "screen" front. To clarify:
The BF537-STAMP has 2 UARTS. UART0 is the console, and that's where the printf() statements go to. I serially connect my host/development PC to the board via uart0, which is configured at 115,200 baud. So no, no telnet.
UART1 has the external module connected to it, which every 2 seconds is sending a packet of binary data at 57,600 baud. The code I'm running (provided in the first post) sits in an endless loop, reading those bytes from UART1 and displaying them on the console via printf().
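Because the first post opens the port with O_NDELAY and then sets FNDELAY again, read() returns immediately when no data is waiting, so the endless loop spins between packets. This is unrelated to the kernel panic, but a blocking setup is gentler on the CPU. The sketch below fills in a termios structure with the same raw 8N1 settings as the first post and adds VMIN/VTIME so that read() sleeps until at least one byte arrives. set_raw_8n1_blocking is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/*
 * Fill in raw 8N1 settings at the given speed, with blocking reads.
 * VMIN = 1 and VTIME = 0 make read() wait for at least one byte.
 * Returns 0 on success, -1 if the speed is rejected.
 */
int set_raw_8n1_blocking(struct termios *t, speed_t speed)
{
    assert(t != NULL);

    if (cfsetispeed(t, speed) != 0 || cfsetospeed(t, speed) != 0)
        return -1;

    t->c_cflag |= (CLOCAL | CREAD);               /* local mode, receiver on */
    t->c_cflag &= ~(PARENB | CSTOPB | CSIZE);     /* no parity, 1 stop bit */
    t->c_cflag |= CS8;                            /* 8 data bits */
#ifdef CRTSCTS
    t->c_cflag &= ~CRTSCTS;                       /* no hardware flow control (non-POSIX flag) */
#endif
    t->c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG); /* raw input */
    t->c_iflag &= ~(IXON | IXOFF | IXANY);        /* no software flow control */
    t->c_oflag &= ~OPOST;                         /* raw output */

    t->c_cc[VMIN] = 1;                            /* block until 1 byte is available */
    t->c_cc[VTIME] = 0;                           /* no inter-byte timeout */
    return 0;
}
```

The structure would be loaded with tcgetattr() first and applied with tcsetattr(fd, TCSANOW, &t) afterwards. VMIN only takes effect on a blocking descriptor, so the non-blocking flag has to be cleared after open(), for example with fcntl(fd, F_SETFL, 0).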
2008-04-18 01:03:24 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54449 Just to add to the confusion...
I took an idea from Rob D's suggestion. I changed the external module's settings so it output ASCII at 4800 baud (instead of binary at 57600 baud). Then at the blackfin uart0 console I did this:
root:/> stty -F /dev/ttyBF1 ispeed 4800
root:/> cat /dev/ttyBF1
This resulted in a periodic stream of ASCII characters being displayed on the (uart0) console. I left it running, came back an hour later, and found the blackfin had again rebooted, with:
BUG: failure at include/linux/timer.h:153/add_timer()!
Kernel panic - not syncing: BUG!
icache_enable
U-Boot 1.1.6-svn (ADI-2007R1) (Oct 4 2007 - 12:42:05)
Maybe a uart1 driver problem???? I'm a bit baffled.
2008-04-18 01:33:49 Re: failure at include/linux/timer.h:153/add_timer()!
Mike Frysinger (UNITED STATES)
Message: 54451 i doubt it's an issue specific to any UART ... it looks like there is a race condition when driving multiple UARTs simultaneously
2008-04-18 12:37:22 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54486 Mike, based on your "race condition" comment I made a minor change to the IRQ handler in uclinux-dist/linux-2.6.x/drivers/serial/bfin_5xx.c
static irqreturn_t bfin_serial_dma_rx_int(int irq, void *dev_id)
{
    struct bfin_serial_port *uart = dev_id;
    unsigned short irqstat;
    spin_lock(&uart->port.lock);
    irqstat = get_dma_curr_irqstat(uart->rx_dma_channel);
    clear_dma_irqstat(uart->rx_dma_channel);
    // spin_unlock(&uart->port.lock); // FVH 18/4/2008
    del_timer(&(uart->rx_dma_timer));
    uart->rx_dma_timer.expires = jiffies;
    add_timer(&(uart->rx_dma_timer));
    spin_unlock(&uart->port.lock); // FVH 18/4/2008
    return IRQ_HANDLED;
}
I don't pretend to have any understanding (at all :-) of how this UART driver is supposed to function, so I don't know if moving the spin_unlock was an appropriate thing to do or not. However, at the moment the 4800 baud ASCII data test has been running for an hour and a half without failure. I'll let it run some more, then I'll try the 57,600 baud binary data test as well. So far, I'd have to say this does seem to be an improvement. (Hope I haven't jinxed myself by saying that :-)
2008-04-18 13:00:09 Re: failure at include/linux/timer.h:153/add_timer()!
Robin Getz (UNITED STATES)
Message: 54489 Frank:
BUG_ON is defined in ./asm-generic/bug.h
Just add a dump_bfin_trace_buffer(), so it looks like:
#ifndef HAVE_ARCH_BUG
#define BUG() do { \
    dump_bfin_trace_buffer(); \
    printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __FUNCTION__); \
    panic("BUG!"); \
} while (0)
#endif
and you should get a better trace back.
-Robin
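The reason a change to BUG() helps with a failure reported from BUG_ON() is that BUG_ON(condition) calls BUG() when the condition is true, so a hook added to BUG() runs at every BUG_ON() site, including the one in add_timer(). The following is a userspace sketch of that relationship only. Every demo_ name and DEMO_ macro is hypothetical, and the counters stand in for the real trace dump and panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static int demo_trace_dumps;   /* how many times the trace hook ran */
static int demo_panics;        /* how many times "panic" was reached */

/* Stand-in for dump_bfin_trace_buffer(). */
static void demo_dump_trace(void)
{
    demo_trace_dumps++;
}

/* Stand-in for panic(); the real one does not return. */
static void demo_panic(const char *msg)
{
    assert(msg != NULL);
    demo_panics++;
}

/* Same shape as the patched BUG(): dump the trace, report, panic. */
#define DEMO_BUG() do { \
    demo_dump_trace(); \
    printf("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \
    demo_panic("BUG!"); \
} while (0)

/* BUG_ON() is a conditional BUG(), so it inherits the trace dump. */
#define DEMO_BUG_ON(cond) do { if (cond) DEMO_BUG(); } while (0)

/* Stand-in for the check at the top of add_timer(). */
void demo_add_timer_check(int timer_is_pending)
{
    DEMO_BUG_ON(timer_is_pending);
}
```

Calling demo_add_timer_check(1) runs the trace hook before the panic path, while demo_add_timer_check(0) does nothing.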
2008-04-18 17:24:06 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54507 I added the line to BUG as you suggested Robin. The 57,600 baud test failed after almost 2 hours. Here's the result:
Hardware Trace:
0 Target : <0x000043cc> { _dump_bfin_trace_buffer + 0x0 }
Source : <0x0008ede8> { _bfin_serial_rx_dma_timeout + 0x238 }
1 Target : <0x0008ede8> { _bfin_serial_rx_dma_timeout + 0x238 }
Source : <0x0008ed56> { _bfin_serial_rx_dma_timeout + 0x1a6 }
2 Target : <0x0008ed32> { _bfin_serial_rx_dma_timeout + 0x182 }
Source : <0x0001aa28> { _queue_delayed_work + 0x24 }
3 Target : <0x0001aa24> { _queue_delayed_work + 0x20 }
Source : <0x0001a90e> { _queue_delayed_work_on + 0xa6 }
4 Target : <0x0001a906> { _queue_delayed_work_on + 0x9e }
Source : <0x0001460e> { ___mod_timer + 0x76 }
5 Target : <0x000145f2> { ___mod_timer + 0x5a }
Source : <0x000143b2> { _internal_add_timer + 0x2a }
6 Target : <0x00014388> { _internal_add_timer + 0x0 }
Source : <0x000145ee> { ___mod_timer + 0x56 }
7 Target : <0x000145e4> { ___mod_timer + 0x4c }
Source : <0x000145c8> { ___mod_timer + 0x30 }
8 Target : <0x000145b2> { ___mod_timer + 0x1a }
Source : <0x000144a0> { _lock_timer_base + 0x24 }
9 Target : <0x0001447c> { _lock_timer_base + 0x0 }
Source : <0x000145ae> { ___mod_timer + 0x16 }
10 Target : <0x00014598> { ___mod_timer + 0x0 }
Source : <0x0001a902> { _queue_delayed_work_on + 0x9a }
11 Target : <0x0001a8ba> { _queue_delayed_work_on + 0x52 }
Source : <0x0001a1ac> { _wq_per_cpu + 0x8 }
12 Target : <0x0001a1a4> { _wq_per_cpu + 0x0 }
Source : <0x0001a8b6> { _queue_delayed_work_on + 0x4e }
13 Target : <0x0001a8a2> { _queue_delayed_work_on + 0x3a }
Source : <0x0001a896> { _queue_delayed_work_on + 0x2e }
14 Target : <0x0001a868> { _queue_delayed_work_on + 0x0 }
Source : <0x0001aa20> { _queue_delayed_work + 0x1c }
15 Target : <0x0001aa18> { _queue_delayed_work + 0x14 }
Source : <0x0001aa0e> { _queue_delayed_work + 0xa }
BUG: failure at include/linux/timer.h:153/add_timer()!
Kernel panic - not syncing: BUG!
icache_enable
U-Boot 1.1.6-svn (ADI-2007R1) (Oct 4 2007 - 12:42:05)
Edit: I tried wrapping spin_lock / spin_unlock around the contents of bfin_serial_rx_dma_timeout() but it made no difference - system still failed after a couple of hours, resulting in this exact same trace. Help!
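The trace ends in bfin_serial_rx_dma_timeout(), which suggests the timeout handler itself calls add_timer() to re-arm rx_dma_timer. The kernel's timer core detaches a timer before running its handler, so if the RX DMA interrupt fires while the handler is running, its del_timer() finds nothing to delete, its add_timer() arms the timer, and the handler's later add_timer() then trips BUG_ON(timer_pending(timer)). Below is a userspace model of that interleaving, not kernel code. All model_ names are hypothetical, and the mod_timer-style variant only illustrates a call that tolerates a pending timer; the thread does not say what the eventual fix changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct timer_list. */
struct model_timer {
    bool pending;
    unsigned long expires;
};

/* Like del_timer(): harmless if the timer is not pending. */
static void model_del_timer(struct model_timer *t)
{
    t->pending = false;
}

/* Like add_timer(): returns false where the kernel would hit BUG_ON(). */
static bool model_add_timer(struct model_timer *t)
{
    if (t->pending)
        return false;
    t->pending = true;
    return true;
}

/* Like mod_timer(): accepts a timer whether or not it is pending. */
static void model_mod_timer(struct model_timer *t, unsigned long expires)
{
    t->expires = expires;
    t->pending = true;
}

/* The same three steps as bfin_serial_dma_rx_int(). */
static void model_rx_irq(struct model_timer *t, unsigned long now)
{
    bool ok;

    model_del_timer(t);
    t->expires = now;
    ok = model_add_timer(t);   /* always succeeds right after a delete */
    assert(ok);
    (void)ok;
}

/*
 * One run of the timeout handler. Returns true if the re-arm at the
 * end succeeds, false where the kernel would have hit the BUG.
 */
bool model_timeout_rearm(bool irq_in_handler, bool use_mod_timer)
{
    struct model_timer t = { true, 100 };
    unsigned long now = 100;

    t.pending = false;             /* timer core detaches before the handler runs */

    if (irq_in_handler)
        model_rx_irq(&t, now);     /* interrupt re-arms the timer mid-handler */

    if (use_mod_timer) {
        model_mod_timer(&t, now + 5);
        return true;
    }
    t.expires = now + 5;
    return model_add_timer(&t);
}
```

In this model only the combination of an interrupt during the handler and an add_timer-style re-arm fails. On a uniprocessor kernel spin_lock() does not mask interrupts, which may be why adding spin_lock() / spin_unlock() pairs did not help, although that is an inference rather than something confirmed in the thread.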
2008-04-20 23:08:29 Re: failure at include/linux/timer.h:153/add_timer()!
Robin Getz (UNITED STATES)
Message: 54577 Frank:
Yeah - it looks exactly like the race condition that Mike thought it was.
I think he opened a bug - have a look at:
for further updates.
My guess is that if you change things from DMA to PIO the problem goes away.
-Robin
2008-04-21 17:12:44 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54647 Thanks very much Robin / Mike.
Just FYI, I tried going crazy, putting spin_lock / unlock around almost every single function in that driver file - although entertaining, it made no difference. I think at this point we're just going to have to cross our fingers that one of your gurus will be able to fix it.
Thanks again for your help.
2008-04-21 21:55:34 Re: failure at include/linux/timer.h:153/add_timer()!
Pawel Pastuszak (CANADA)
Message: 54659 Hi Guys,
Just been reading this issue and I have to say I AM HAVING the same damn problem on my board too... So Frank, if you solve this problem, let me know and I will do the same. I just started looking at it...
What I see is that it's a driver issue: I can test that UART1 works perfectly in U-Boot, but not in uClinux.
Pawel
2008-04-21 23:09:28 Re: failure at include/linux/timer.h:153/add_timer()!
Robin Getz (UNITED STATES)
Message: 54666 Frank/Pawel:
Did anyone try PIO mode yet?
Thanks
-Robin
2008-04-22 16:08:31 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54735 I'm trying PIO mode now Robin - I'll let it run overnight & post the result.
2008-04-22 16:23:07 Re: failure at include/linux/timer.h:153/add_timer()!
Robin Getz (UNITED STATES)
Message: 54736 Frank:
Sonic was able to reproduce the issue, and committed a fix. You may want to check that out.
-Robin
2008-04-23 00:53:49 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54749 PIO mode (with the old code) ran the 4800 baud test for 8 or 9 hours without failure. Looks like your suspicion about PIO mode being OK was right on the money Robin. I'm trying Sonic's fix (in DMA mode) now.
2008-04-23 11:56:19 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54791 11 hours on the 4800 baud test and still going strong. Kudos & many thanks to Sonic for the fix.
2008-04-24 13:38:36 Re: failure at include/linux/timer.h:153/add_timer()!
Pawel Pastuszak (CANADA)
Message: 54880 Hi Guys,
I took the svn trunk version of the serial port and I'm still getting a kernel panic message, the same one.
Notes:
- I am using the uClinux-dist-2008R1-RC8
- Command that crashes: stty -F /dev/ttyBF1
Any ideas?
Pawel
2008-04-24 13:59:03 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54882 The shell command you listed:
stty -F /dev/ttyBF1
doesn't actually do much - it only prints the current settings of the specified port. If that command is causing you a crash, I'm wondering if you actually have uart1 enabled. If you do a:
ls -l /dev/tty*
do you see ttyBF1 listed there?
2008-04-24 14:13:45 Re: failure at include/linux/timer.h:153/add_timer()!
Pawel Pastuszak (CANADA)
Message: 54884 Yes, I do see it, and I know this command is simple, but sometimes simple is better. I just want to test whether there is communication first, before I run my app...
2008-04-24 14:42:52 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54888 Some more silly questions then:
1) Does stty work on ttyBF1?
2) What's the output from: uname -a ?
3) You said in your previous email:
I took the svn trunk version of the serial port...
I am using the uClinux-dist-2008R1-RC8
These statements sound a little puzzling; SVN trunk is different to the 2008R1 branch. Are you mixing & matching code from both source trees?
4) Just FYI, I've never had a problem with the stty command. The bug Sonic just fixed was for when data was being handled by both serial ports simultaneously. I'm suspecting your problem is entirely different. Can you post the output when stty crashes?
2008-04-24 15:38:05 Re: failure at include/linux/timer.h:153/add_timer()!
Mike Frysinger (UNITED STATES)
Message: 54889 please post the actual crash instead of saying "i'm getting a crash"
2008-04-24 16:34:11 Re: failure at include/linux/timer.h:153/add_timer()!
Pawel Pastuszak (CANADA)
Message: 54903 Hi Frank,
1) Does stty work on ttyBF1?
No, but on another board it does.
2)
root:~> uname -a
Linux VPM 2.6.22.18-ADI-2008R1astfin-svn #4 Thu Apr 24 13:32:07 EDT 2008 blackfin unknown
root:~>
root:~>
root:~> stty -F /dev/ttyBF1
BUG: failure at kernel/timer.c:467/mod_timer()!
Kernel panic - not syncing: BUG!
3) And yes, I am mixing code from trunk with R1-RC8.
Frank, you seem to have had the same problem. Do you think you could send me your serial driver, if you fixed it?
Thanks
Pawel
2008-04-24 20:55:15 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 54913 Looks like you need to work out why it's working on one of your boards, but not the other. Obviously there's a difference between them, in software, or hardware, or both.
I'm using 2008R1 as-is, from the SVN branch, with no trunk code or anything else patched in. My serial driver is not edited in any way - it's the latest one you get when you do a checkout of the 2008R1 branch.
2008-07-31 03:46:14 Re: failure at include/linux/timer.h:153/add_timer()!
DAVID ZHOU (CHINA)
Message: 59632
hi all,
This bug has been fixed. But my Linux is currently 2008R1 RC8. What should I do if I want to fix this bug without using the SVN version? Can anyone please help me? Thanks very much!
2008-08-01 15:53:56 Re: failure at include/linux/timer.h:153/add_timer()!
Frank Van Hooft (CANADA)
Message: 59767
This bug IS fixed in 2008R1.