[#4184] pthread test case crash with CPLB miss error in SMP kernel
Submitted By: Vivi Li
Open Date: 2008-06-23 04:06:41
Close Date: 2008-11-19 03:21:43
Priority: Medium
Assignee: Graf Yang, Jie Zhang
Status: Closed
Fixed In Release: N/A
Found In Release: N/A
Release:
Category: N/A
Board: EZKIT Lite
Processor: BF561
Silicon Revision:
Is this bug repeatable?: Yes
Resolution: Duplicate
Uboot version or rev.:
Toolchain version or rev.: 08r1.5-11
App binary format: N/A
Summary: pthread test case crash with CPLB miss error in SMP kernel
Details:
The pthread test case crashes with a CPLB miss error.
The pthread test script and build script are located at uclinux-dist/testsuites/oprofile.
The source code of the pthread test is located at uclinux-dist/user/blkfin-test/pthread_test.
--
root:/> cd bin/
root:/bin> ./ex1
create a succeeded 0D
create btaucce ded a
CaaaaPaaaaLaaaaBaaaa maaaaiss
- Used by the MMU to signal a CPLB miss on a data access.
aaaDefered Exception context
CURRENT PROCESS:
COMM=ex1 PID=98
aTEXT = 0x031c0040-0x031ce640 DATA = 0x031ce644-0x031d4854
BSS = 0x031d4854-0x031d6ee4 USER-STACK = 0x031d7f88
areturn address: [0x031c095c]; contents of:
0x031c0930: a 05b4 e200 321b a 0000 04c3 3228 a e141 031d
a0x031c0940: e101 00fc a e800 002a e428 0066 a 4f20 5008
0x031c0950: 3210 a0d0 e14a ffb0 e10a 0000 [9310] bc55
0x031c0960: e300 a 251e b168 e121 a 0064 3045 5048 a 6002
SEQUENCER STATUS: Not tainted
SEQSTAT: 00000026 IPEND: 0030 SYSCFG: 0036
HWERRCAUSE: 0x0
EXCAUSE : 0x26
a RETE: <0x00000000> /* Maybe null pointer? */
RETN: <0x031be000> /* unknown address */
RETX: <0x031c095c> [ ex1 + 0x91c ]
RETS: <0x031c5164> [ ex1 + 0x5124 ]
a PC : <0x031c095c> [ ex1 + 0x91c ]
DCPLB_FAULT_ADDR: <0xffb00000> /* unknown address */
aICPLB_FAULT_ADDR: <0x031c095c> [ ex1 + 0x91c ]
PROCESSOR STATE:
a R0 : 031e8004 R1 : 031d00fc R2 : 00000f21 R3 : 031ec004
R4 : 031ce6ac R5 : 031c0140 R6 : 00000030 R7 : 00004000
a P0 : 031c0938 P1 : 031ebe24 P2 : ffb00000 P3 : 0000005f
P4 : 00000000 P5 : 031ebe24 FP : 031ebe04 SP : 031bdf24
LB0: 031c6679 LT0: 031c6678 LC0: 00000000
LB1: 031c0f6f LT1: 031c0f6e LC1: 00000000
B0 : 00000000 L0 : 00000000 M0 : 00000000 I0 : 037adeac
B1 : 00000000 L1 : 00000000 M1 : 00000000 I1 : 00000000
B2 : 00000000 L2 : 00000000 M2 : 00000000 I2 : 00000000
B3 : 00000000 L3 : 00000000 M3 : 00000000 I3 : 00000000
A0.w: 00000000 A0.x: 00000000 A1.w: 00000000 A1.x: 00000000
USP : 031ebd5c ASTAT: 02002020
aNo trace since you do not have CONFIG_DEBUG_BFIN_NO_KERN_HWTRACE enabled
Stack from 031bdf04:
a 00001686 00009d54 ff700028a ff700028 ff700024 e7306c08a e3fe0017 2f925f25
031c095c 00000030 00000026 00000000 031be000 031c095c 031c095ca 031c5164
031e8004 02002020 031c0f6fa 031c6679 031c0f6e 031c6678 00000000a 00000000
00000000 00000000 00000000 00000000 00000000 00000000a 00000000 00000000
00000000 00000000 00000000 00000000a 00000000 00000000 00000000a 00000000
00000000 00000000 00000000 037adeac 031ebd5c 031ebe04 031ebe24 00000000
Call Trace:
[<00004000>]a _do_settimeofday+0x10/0xd4
aaaaaaaaaaaaaroot:/bin>
--
Follow-ups
--- Graf Yang 2008-06-23 06:06:30
When this thread runs on CoreB, it cannot access address 0xffb00000.
--- Graf Yang 2008-06-25 02:43:02
This app calls libthread, which accesses the CoreA scratchpad. It fails when
run on CoreB.
--- Mike Frysinger 2008-06-25 07:44:50
then our toolchain needs updating
--- Graf Yang 2008-06-25 22:08:40
I think we should implement the following steps in the toolchain to access the scratchpad:
1. detect which CPU we are running on
2. call sched_setaffinity() to bind the current thread to this CPU
3. access the scratchpad of the current CPU
4. call sched_setaffinity() to allow the thread to run on any CPU again
Are there any suggestions?
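The four steps above can be sketched in userspace roughly as follows. This is only an illustration, not the toolchain change itself: the helper names `pin_to_current_cpu()` and `restore_affinity()` are hypothetical, and the sketch assumes a Linux libc that provides `sched_getcpu()` (for example glibc with `_GNU_SOURCE`), which the uClibc in use here may not offer.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper covering steps 1 and 2.
 * Saves the caller's affinity mask in *saved, then binds the calling
 * thread to the CPU it is currently running on.
 * Returns the CPU number, or -1 on error. */
int pin_to_current_cpu(cpu_set_t *saved)
{
    cpu_set_t pinned;
    int cpu;

    /* Remember the original mask so step 4 can restore it. */
    if (sched_getaffinity(0, sizeof(*saved), saved) != 0)
        return -1;

    /* Step 1: detect which CPU we are running on. */
    cpu = sched_getcpu();
    if (cpu < 0)
        return -1;

    /* Step 2: bind the calling thread (pid 0 means "self") to that CPU.
     * Even if the thread migrated after step 1, it is on `cpu` once
     * this call returns successfully. */
    CPU_ZERO(&pinned);
    CPU_SET(cpu, &pinned);
    if (sched_setaffinity(0, sizeof(pinned), &pinned) != 0)
        return -1;

    return cpu;
}

/* Hypothetical helper covering step 4: restore the saved mask so the
 * thread may run on any previously allowed CPU again. */
int restore_affinity(const cpu_set_t *saved)
{
    return sched_setaffinity(0, sizeof(*saved), saved);
}
```

Step 3, the scratchpad access itself, would sit between the two calls. The thread can still be preempted there, but it cannot migrate to the other core while pinned.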
--- Mike Frysinger 2008-06-26 00:26:51
part (3) would have to be done with the kernel. the toolchain asks the kernel
"what is the currently valid scratch pad address". this is tracked as
[#2566] already ...
--- Sonic Zhang 2008-07-08 23:37:46
Duplicate of bug [#2566]
--- Jean-Christian de Rivaz 2008-08-11 08:44:48
Can you test whether the toolchain patch below solves the problem?
diff --git a/uClibc/libc/sysdeps/linux/bfin/bfin_l1layout.h b/uClibc/libc/sysdeps/linux/bfin/bfin_l1layout.h
index 00efd23..ce7f565 100644
--- a/uClibc/libc/sysdeps/linux/bfin/bfin_l1layout.h
+++ b/uClibc/libc/sysdeps/linux/bfin/bfin_l1layout.h
@@ -1,4 +1,4 @@
-#define L1_SCRATCH_START 0xFFB00000
+#define L1_SCRATCH_START 0xFEB00010
 /* Data that is "mapped" into the process VM at the start of the L1 scratch
    memory, so that each process can access it at a fixed address. Used for
--- Vivi Li 2008-10-20 05:06:49
I no longer see the CPLB miss error after switching to the latest toolchain.
But the pthread test results are still not right.
Sometimes a test gives the correct result, and sometimes it prints "CPU = 1"
or some other unexpected output.
Below is the log for each test case:
--
root:/> ex1
CPU = 1
root:/> ex2
CPU = 1
root:/> ex3
bearchinC PoU t=e n1
r = 4337...
root:/> ex4
keread 4C0P Ul o=a e1
0
Thread 400: allocating buffer at 0x3297148
root:/> ex5
CPU = 1
root:/> ex6
CPU = 1
root:/> ex7
CPU = 1
root:/> ptest
PASS
root:/> ex1
Starting process a
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaacreate a succeeded
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
root:/> (right result)
root:/> ex2
0 --->
1 --->
2 --->
3 --->
4 --->
5 --->
6 --->
C P-U>
= -1
9 --->
10 --->
11 --->
12 --->
13 --->
14 --->
15 --->
root:/> (wrong result)
root:/> ex3
Searching for the number = 192...
CPU = 1 (wrong result)
root:/> ex4
Thread 400: allocated key 0
thrCPdU4 0= a l1c
ng buffer at 0x5c8148
Thread 402: allocating buffer at 0x5c85d0
Thread 402: "Result of first thread"
Thread 402: freeing buffer at 0x5c85d0 (wrong result)
root:/> ex5
0 --->
1 --->
2 --->
3 --->
4 --->
5 --->C
-71 P U -=
>
8 --->
9 --->
10 --->
11 --->
12 --->
13 --->
14 --->
15 ---> (wrong result)
root:/> ex6
count = 0
count = 1
count = 2
count = 3
count = 4
count = 5
count = 6
count = 7
count = 8
count = 9
count = 10
count = 11
(...)
count = 1990
count = 1991
count = 1992
count = 1993
count = 1994
count = 1995
count = 1996
count = 1997
count = 1998
count = 1999 (right result)
root:/> ex7
waiting 0 ms ...
count = 0
waiting 100 ms ...
count = 1
waiting 200 ms ...
count = 2
(...)
count = 18
waiting 1900 ms ...
count = 19 (right result)
root:/> ptest
PASS
root:/>
--
--- Sonic Zhang 2008-10-20 05:36:45
Need to call sched_setaffinity() at the beginning of main().
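A minimal sketch of that workaround, assuming CoreA is CPU 0 under the SMP kernel (the crash dumps in this report show "CPU = 1" for the failing case). The helper name `pin_to_cpu()` is hypothetical and not part of the test suite.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
int pin_to_cpu(int cpu)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    /* pid 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(mask), &mask);
}
```

A test would call `pin_to_cpu(0)` as the first statement of main(), before any pthread_create(). On Linux, threads created afterwards normally inherit the creator's affinity mask, so the whole test should then stay on that core.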
--- Mike Frysinger 2008-10-20 05:40:55
really we should just fix the smp system already instead of hacking every single
one of our tests with changes that grace/vivi will have to simply remove once
the kernel is fixed
--- Vivi Li 2008-10-22 06:00:05
The CPLB miss error happens again.
--
root:/bin> ./ex1
Data access CPLB miss
- Used by the MMU to signal a CPLB miss on a data access.
Deferred Exception context
CURRENT PROCESS:
COMM=ex1 PID=134
CPU = 1
TEXT = 0x03380040-0x0338eba0 DATA = 0x0338eba4-0x03394db4
BSS = 0x03394db4-0x03397444 USER-STACK = 0x03398f8c

return address: [0x03380b8a]; contents of:
0x03380b60: 3041 e300 1277 2fd7 05e3 e14a 0339 e800
0x03380b70: 0073 e10a 6f08 b278 e149 ffb0 9110 e14a
0x03380b80: 0339 e109 0000 e10a 6f0c [9308] 9110 b048
0x03380b90: e14a 0339 e140 0339 e10a 4844 e100 488c

SEQUENCER STATUS: Not tainted
SEQSTAT: 00060026 IPEND: 0030 SYSCFG: 0006
EXCAUSE : 0x26
RETE: <0x00000000> /* Maybe null pointer? */
RETN: <0x032ee000> /* kernel dynamic memory */
RETX: <0x00000480> /* Maybe fixed code section */
RETS: <0x033852bc> [ ex1 + 0x527c ]
PC : <0x03380b8a> [ ex1 + 0xb4a ]
DCPLB_FAULT_ADDR: <0xffb00000> /* kernel dynamic memory */
ICPLB_FAULT_ADDR: <0x03380b8a> [ ex1 + 0xb4a ]

PROCESSOR STATE:
R0 : 032ea004 R1 : 032ebfe4 R2 : 00000f00 R3 : 00000001
R4 : 03380184 R5 : 00000001 R6 : 03396f08 R7 : 03396f0c
P0 : 03380b68 P1 : ffb00000 P2 : 03396f0c P3 : 03396f00
P4 : 03394a28 P5 : 03397168 FP : 032ebfb4 SP : 032edf24
LB0: 03386739 LT0: 03386738 LC0: 00000000
LB1: 004123a5 LT1: 0041239e LC1: 00000000
B0 : 00000000 L0 : 00000000 M0 : 00000000 I0 : 03398e6c
B1 : 00000000 L1 : 00000000 M1 : 00000000 I1 : 00000000
B2 : 00000000 L2 : 00000000 M2 : 00000000 I2 : 00000000
B3 : 00000000 L3 : 00000000 M3 : 00000000 I3 : 00000000
A0.w: 00000000 A0.x: 00000000 A1.w: 00000000 A1.x: 00000000
USP : 032ebde8 ASTAT: 02003025

No trace since you do not have CONFIG_DEBUG_BFIN_NO_KERN_HWTRACE enabled

Userspace Stack
Stack info:
SP: [0x032ebde8] <0x032ebde8> [ ex1 + 0x1de8 ]
FP: (0x032ebf00)
Memory from 0x032ebde0 to 032ec000
032ebde0: 005b3ca0 00000000 [00000007] 00000000 0338ec0c 005b3ef0 005b3ca0 032ebe80
032ebe00: 005c3e04 0001eb42 032587f8 3f3f3f3f 00000000 03380980 03380980 0001ec34
032ebe20: 005b3ca0 00000007 0378961c 00000000 0000ffbf 0010c934 000042c2 <000010be>
032ebe40: 00000007 0001f32a 00000000 0338ec00 005c3e04 0000489e 00004866 032ebf24
032ebe60: 00000007 00000000 00030001 03380968 037a62bc 037a62bc 00000000 037a6040
032ebe80: 00000007 00000000 00030001 03380968 037a62bc 037a62bc 00000000 037a6040
032ebea0: 00017bcc 032ebe24 000000ac ffffff54 00000010 00000000 032ebed0 00000001
032ebec0: 00000000 00000000 00048404 00047ba2 00010b98 032ebed0 <033a65d8> 00017fcc
032ebee0: 032ebf04 00011236 005b3ca0 0057aaa0 0017e520 005c3e04 0000a6ae 032ea000
032ebf00:(00000000)<000010be> 00000002 00000030 0338013c 0000fffe 00000001 00000000
032ebf20: 03380968 03380968 03380968 00060026 00000000 032ec000 00000480 03380968
032ebf40:<033852bc> 0000a8e0 02002020 03380f99 03386739 03380f98 03386738 00000000
032ebf60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
032ebf80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
032ebfa0: 00000000 00000000 00000000 00000000 03273eac 00000000 <033852bc> 03397168
032ebfc0: 03394a28 03396f00 03396f0c 03396f08 00000001 03380184 00000003 0338013c
032ebfe0: 0338ec0c 005c4004 00000f21 0339065c 005c0004 00002000 03394d24 032deff4
032ec000: 03281980
Return addresses in stack:
address : <0x000010be> { _init_post + 0xa6 }
address : <0x033a65d8> /* kernel dynamic memory */
frame 1 : <0x000010be> { _init_post + 0xa6 }
address : <0x033852bc> [ ex1 + 0x527c ]
address : <0x033852bc> [ ex1 + 0x527c ]
root:/bin>
--
--- Graf Yang 2008-10-22 22:58:57
"CPU = 1" is printed by the kernel. The kernel also wrote the full dump
message to /var/log/message, but that was not displayed on the console, which
is why the output looks strange.
--- Robin Getz 2008-10-23 07:39:43
I agree with Mike - We should not be changing test cases to make them work on
our version of "SMP".
--- Graf Yang 2008-10-23 22:56:51
This issue is the same as toolchain bug [#2566].
toolchain/uClibc/libc/sysdeps/linux/bfin/bfin_l1layout.h hardcodes
L1_SCRATCH_START as 0xFFB00000. Because the BF561 CoreB scratchpad is at
0xFF700000, a pthread write of stack info to L1_SCRATCH_START while running on
CoreB causes the CPLB miss error.
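To make the mismatch concrete, here is a small sketch of a per-core lookup using the two addresses quoted in this report. The macro and function names are hypothetical, and the core numbering (0 for CoreA, 1 for CoreB) is an assumption. This is not the fix tracked in [#2566], where the toolchain asks the kernel for the valid address. It only shows why one compile-time constant cannot be right on both cores.

```c
#include <stdint.h>

/* Scratchpad base addresses as quoted in this bug report. */
#define COREA_L1_SCRATCH_START 0xFFB00000u
#define COREB_L1_SCRATCH_START 0xFF700000u

/* Hypothetical helper: return the L1 scratchpad base for a core,
 * assuming core 0 is CoreA and core 1 is CoreB.
 * Returns 0 for any other core number. */
uint32_t l1_scratch_start(int core)
{
    switch (core) {
    case 0:
        return COREA_L1_SCRATCH_START;
    case 1:
        return COREB_L1_SCRATCH_START;
    default:
        return 0;
    }
}
```

A binary built with the fixed 0xFFB00000 value always behaves like `l1_scratch_start(0)`, which matches the DCPLB_FAULT_ADDR of 0xffb00000 in the dumps taken while running on CPU 1.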
--- Vivi Li 2008-10-24 06:13:30
The test now uses taskset to force it to run on CoreA, the same as in bug [#4540].
It works now.