2008-11-20 07:24:36 How does module relocation work
Michael McTernan (UNITED KINGDOM)
Message: 65591
I've got a problem with a module which crashes on load. Hacking around with the stack back tracing got me some nice traces, and I can see it going wrong at a CALL instruction. This is with MPU protection on, and it seems to reliably fail in a function named ipnet_init:
(gdb) disassemble ipnet_init
Dump of assembler code for function ipnet_init:
0x00c46388 <ipnet_init+0>: [--SP] = (R7:7, P5:5);
0x00c4638a <ipnet_init+2>: LINK 0x20; /* (32) */
0x00c4638e <ipnet_init+6>: R0 = 0x6704 (X); /* R0=0x0x6704 <module_frob_arch_sections+160>(26372) */
0x00c46392 <ipnet_init+10>: [FP -0x4] = R0;
0x00c46394 <ipnet_init+12>: P2.H = 0xc9; /* (201) P2=0x0xc90000 <_end+7971516> */
0x00c46398 <ipnet_init+16>: P2.L = 0x1f28; /* (7976) P2=0x0xc91f28 <ipnet_conf_max_sockets> */
0x00c4639c <ipnet_init+20>: R0 = [P2];
0x00c4639e <ipnet_init+22>: R1 = 0x1 (X); /* R1=0x1( 1) */
0x00c463a0 <ipnet_init+24>: R1 <<= 0x14;
0x00c463a2 <ipnet_init+26>: CC = R0 <= R1;
0x00c463a4 <ipnet_init+28>: IF CC JUMP 0x0xc463ae <ipnet_init+38>;
0x00c463a6 <ipnet_init+30>: R0 = -0x3eb (X); /* R0=0x0xfffffc15(-1003) */
0x00c463aa <ipnet_init+34>: [FP -0x10] = R0;
0x00c463ac <ipnet_init+36>: JUMP.S 0x0xc467b8 <ipnet_init+1072>;
0x00c463ae <ipnet_init+38>: P2.H = 0xc9; /* (201) P2=0x0xc91f28 <ipnet_conf_max_sockets> */
0x00c463b2 <ipnet_init+42>: P2.L = 0x1f28; /* (7976) P2=0x0xc91f28 <ipnet_conf_max_sockets> */
0x00c463b6 <ipnet_init+46>: R0 = [P2];
0x00c463b8 <ipnet_init+48>: R1 = R0 << 0x2;
0x00c463bc <ipnet_init+52>: R0 = [FP -0x4];
0x00c463be <ipnet_init+54>: R0 = R0 + R1;
0x00c463c0 <ipnet_init+56>: [FP -0x4] = R0;
0x00c463c2 <ipnet_init+58>: P2.H = 0xcb; /* (203) P2=0x0xcb1f28 <ipcom_lkm_sys_net_ipv4_neigh_default+76> */
0x00c463c6 <ipnet_init+62>: P2.L = 0x4b4; /* (1204) P2=0x0xcb04b4 <ipnet_conf_cache_bufsiz> */
0x00c463ca <ipnet_init+66>: R0 = W[P2] (X);
0x00c463cc <ipnet_init+68>: R1 = R0.L (Z);
0x00c463ce <ipnet_init+70>: R0 = [FP -0x4];
0x00c463d0 <ipnet_init+72>: CALL 0x0x1a02060;
I hacked module.c to print the sections during module load:
-s .text 0xc00000 \
-s .init.text 0x7dce030 \
-s .rodata 0xc88678 \
-s .rodata.str1.4 0xc9c6ec \
-s __ksymtab_strings 0xc9cee0 \
-s __ksymtab 0xc9cf7c \
-s .data 0xcb01d0 \
-s .gnu.linkonce.this_module 0xcb20a0 \
-s .bss 0xcb2220 \
-s .symtab 0xc9cfbc \
-s .strtab 0xca4c8c \
So the final CALL looks to be to a crazy address, and the crash trace confirms this as the cause: the last jump is from ipnet_init+72 and the PC ends up at 0x1a02060:
Undefined instruction
- May be used to emulate instructions that are not defined for
a particular processor implementation.
Kernel OOPS in progress
Defered Exception context
CURRENT PROCESS:
COMM=insmod PID=31
TEXT = 0x07980040-0x079c5c00 DATA = 0x079c5c04-0x079d2704
BSS = 0x079d2704-0x079d87d4 USER-STACK = 0x079dff74
return address: [0x01a02060]; contents of:
0x01a02040: 3fff ffff ffff ffff ffff bfff ffb7 bfff
0x01a02050: feff ffff ffff ffff ffaf ffff ffff 9f7f
0x01a02060: [ffbf] ffff fffe eff7 ffff fff7 7fbf e7ff
0x01a02070: ffdf ffff fdff ffff f7f7 f7bf bfff fbff
SEQUENCER STATUS: Tainted: P
SEQSTAT: 00000021 IPEND: 8030 SYSCFG: 0006
EXCAUSE : 0x21
physical IVG15 asserted : <0xffa01288> /* unknown address */
logical irq 6 mapped : <0xffa0037c> /* unknown address */
logical irq 18 mapped : <0x000a5dc4> { _bfin_serial_rx_int + 0x0 }
logical irq 19 mapped : <0x000a5ff8> { _bfin_serial_tx_int + 0x0 }
logical irq 24 mapped : <0x000ae360> { _bf537mac_interrupt + 0x0 }
RETE: <0x00000000> /* Maybe null pointer? */
RETN: <0x00bf9da4> /* unknown address */
RETX: <0x01a02060> /* unknown address */
RETS: <0x00c463d4> { :ipnet:_ipnet_init + 0x4c }
PC : <0x01a02060> /* unknown address */
DCPLB_FAULT_ADDR: <0x00bf9968> /* unknown address */
ICPLB_FAULT_ADDR: <0x00092ef4> { _sprintf + 0x0 }
PROCESSOR STATE:
R0 : 00007704 R1 : 00000020 R2 : 06052340 R3 : 00000000
R4 : 07ba0b18 R5 : 00000015 R6 : 00bcb2d0 R7 : 00cb20a0
P0 : 00bf8000 P1 : 00bf8000 P2 : 00cb04b4 P3 : 00cb20a0
P4 : 00bcb0e0 P5 : 00bcb2a4 FP : 00bf9dc4 SP : 00bf9cc8
LB0: ffa01e10 LT0: ffa01e10 LC0: 00000000
LB1: 00092d2c LT1: 00092d22 LC1: 00000000
B0 : 00000000 L0 : 00000000 M0 : 00000000 I0 : 00136880
B1 : 00000000 L1 : 00000000 M1 : 00000000 I1 : 00000000
B2 : 00000000 L2 : 00000000 M2 : 00000000 I2 : fffc39ed
B3 : 00000000 L3 : 00000000 M3 : 00000000 I3 : 00000000
A0.w: 00000000 A0.x: 00000000 A1.w: 00000000 A1.x: 00000000
USP : 079dfe84 ASTAT: 02002060
Hardware Trace:
0 Target : <0xffa00c2c> /* unknown address */
Source : <0xffa0090c> /* unknown address */
1 Target : <0xffa00c2c> /* unknown address */ <- trap
Source : <0x00c463d0> { :ipnet:_ipnet_init + 0x48 }
2 Target : <0x00c463ca> { :ipnet:_ipnet_init + 0x42 }
Source : <0xffa0090c> /* unknown address */
3 Target : <0xffa00c2c> /* unknown address */
Source : <0x00c463c6> { :ipnet:_ipnet_init + 0x3e }
4 Target : <0x00c463ae> { :ipnet:_ipnet_init + 0x26 }
Source : <0x00c463a4> { :ipnet:_ipnet_init + 0x1c }
5 Target : <0x00c4639c> { :ipnet:_ipnet_init + 0x14 }
Source : <0xffa0090c> /* unknown address */
6 Target : <0xffa00c2c> /* unknown address */
Source : <0x00c46398> { :ipnet:_ipnet_init + 0x10 }
7 Target : <0x00c46388> { :ipnet:_ipnet_init + 0x0 }
So I loaded the module debug file into bfin-uclinux-gdb on my PC to have a look at the same function before it is loaded:
(gdb) disassemble ipnet_init
Dump of assembler code for function ipnet_init:
0x00046388 <ipnet_init+0>: [--SP] = (R7:7, P5:5);
0x0004638a <ipnet_init+2>: LINK 0x20; /* (32) */
0x0004638e <ipnet_init+6>: R0 = 0x6704 (X); /* R0=0x0x6704 <CAST_set_key+3320>(26372) */
0x00046392 <ipnet_init+10>: [FP -0x4] = R0;
0x00046394 <ipnet_init+12>: P2.H = 0x0; /* ( 0) P2=0x0x0 <buf.1956> */
0x00046398 <ipnet_init+16>: P2.L = 0x0; /* ( 0) P2=0x0x0 <buf.1956> */
0x0004639c <ipnet_init+20>: R0 = [P2];
0x0004639e <ipnet_init+22>: R1 = 0x1 (X); /* R1=0x1( 1) */
0x000463a0 <ipnet_init+24>: R1 <<= 0x14;
0x000463a2 <ipnet_init+26>: CC = R0 <= R1;
0x000463a4 <ipnet_init+28>: IF CC JUMP 0x0x463ae <ipnet_init+38>;
0x000463a6 <ipnet_init+30>: R0 = -0x3eb (X); /* R0=0x0xfffffc15(-1003) */
0x000463aa <ipnet_init+34>: [FP -0x10] = R0;
0x000463ac <ipnet_init+36>: JUMP.S 0x0x467b8 <ipnet_init+1072>;
0x000463ae <ipnet_init+38>: P2.H = 0x0; /* ( 0) P2=0x0x0 <buf.1956> */
0x000463b2 <ipnet_init+42>: P2.L = 0x0; /* ( 0) P2=0x0x0 <buf.1956> */
0x000463b6 <ipnet_init+46>: R0 = [P2];
0x000463b8 <ipnet_init+48>: R1 = R0 << 0x2;
0x000463bc <ipnet_init+52>: R0 = [FP -0x4];
0x000463be <ipnet_init+54>: R0 = R0 + R1;
0x000463c0 <ipnet_init+56>: [FP -0x4] = R0;
0x000463c2 <ipnet_init+58>: P2.H = 0x0; /* ( 0) P2=0x0x0 <buf.1956> */
0x000463c6 <ipnet_init+62>: P2.L = 0x0; /* ( 0) P2=0x0x0 <buf.1956> */
0x000463ca <ipnet_init+66>: R0 = W[P2] (X);
0x000463cc <ipnet_init+68>: R1 = R0.L (Z);
0x000463ce <ipnet_init+70>: R0 = [FP -0x4];
0x000463d0 <ipnet_init+72>: CALL 0x0x45bd0 <ipnet_kioevent_softirq>;
Here the function looks okay and the call is to the correct unrelocated address. So I'm wondering how the CALL instruction got the wrong address after module loading. I'm thinking that relocation must have changed the instruction, but incorrectly, which points to a bug in the toolchain or compiler, or quite possibly in the way the module is built.
Unfortunately the module is alien and uses its own build process, but I added the following to its build:
CFLAGS="-mno-fdpic -mcpu=bf537-0.3 -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-common"
Still no luck. Do I need more?
I also tried enabling the pr_debug lines in apply_relocate_add(), as this is the relocation function that is being called, as opposed to apply_relocate(). I get a lot of output, but don't really understand it, e.g.:
location is cb1514, value is c84654 type is 18
before b90 after c84654
Is location the instruction to be patched? If so, I would expect to see a line patching the CALL at 0xc463d0, but I don't. Maybe that is the bug, but then I don't know how the address in the CALL instruction changed between what gdb showed on the PC and what I see after module load.
Note, there are a bunch of CALL instructions going to this address, so I don't think this is a single corruption of the instruction memory - it's more systematic than that.
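One way to interpret a "location is ..." line is to map the address back to a section using the load addresses printed above. The following is only a rough sketch: `section_for` is a hypothetical helper, and because the section sizes are not shown in the dump it simply assumes an address belongs to the nearest section that starts at or below it.

```python
import bisect

# Section load addresses as printed by the hacked module.c above.
SECTIONS = {
    ".text": 0xC00000,
    ".init.text": 0x7DCE030,
    ".rodata": 0xC88678,
    ".rodata.str1.4": 0xC9C6EC,
    "__ksymtab_strings": 0xC9CEE0,
    "__ksymtab": 0xC9CF7C,
    ".data": 0xCB01D0,
    ".gnu.linkonce.this_module": 0xCB20A0,
    ".bss": 0xCB2220,
    ".symtab": 0xC9CFBC,
    ".strtab": 0xCA4C8C,
}

_sorted = sorted((addr, name) for name, addr in SECTIONS.items())
_starts = [addr for addr, _ in _sorted]


def section_for(addr):
    """Return the section whose start is the closest one at or below addr.

    Heuristic only: sizes are unknown, so an address past the end of a
    section is still attributed to it.
    """
    i = bisect.bisect_right(_starts, addr) - 1
    return _sorted[i][1] if i >= 0 else None


print(section_for(0xCB1514))  # .data  (the "location" in the debug line)
print(section_for(0xC84654))  # .text  (the "value" in the debug line)
print(section_for(0xC463D0))  # .text  (the failing CALL)
```

Under that assumption, the quoted debug line would be patching a word in .data with an address inside .text, not patching an instruction, which may be why it does not correspond to the CALL at 0xc463d0.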
So... is there any good info on how Blackfin relocation works, or Linux relocation in general if Blackfin isn't very different? Am I right that it patches the individual instructions and as such will have modified the CALL statement I'm seeing?
Any help gratefully received.
2008-11-20 08:11:27 Re: How does module relocation work
Bernd Schmidt (GERMANY)
Message: 65597
As a shot in the dark, does the appended patch help?
module-cache.diff
2008-11-20 08:35:54 Re: How does module relocation work
Michael McTernan (UNITED KINGDOM)
Message: 65602
I'm getting a "File not found" when trying to download the patch - can you repost?
2008-11-20 08:37:48 Re: How does module relocation work
Mike Frysinger (UNITED STATES)
Message: 65603
try again. files posted do not immediately show up on the server.
2008-11-20 09:11:39 Re: How does module relocation work
Michael McTernan (UNITED KINGDOM)
Message: 65608
Got the patch - looked interesting, but it hasn't changed the behaviour. I may go to write-through cache as an experiment though.
I've got a little more debug now too. The CALL is updated once during relocation, after which the value is set to the incorrect CALL seen in the disassembly. I found the relevant relocation record with bfin-uclinux-objdump:
000463d2 R_pcrel24 ___umodsi3
The address is the instruction +2, hence why my earlier grep failed. The application of the relocation produces the following output:
location is c463d2, value is ffa02060 type is 10
value is 7f6dde48, before e3ff-fc00 after e36d-de48
The module is still going in at .text 0xc00000. I'm just trying to understand this bit of code now to see if it makes sense...
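The numbers in that debug output can be reproduced by hand. The sketch below assumes, based on the output above and on the usual description of the Blackfin CALL encoding, that the instruction stores a signed 24-bit offset counted in 16-bit halfwords relative to the start of the instruction (the relocation location minus 2). The function names are made up for illustration and are not the kernel's.

```python
MASK24 = 0xFFFFFF


def encode_pcrel24(insn_addr, target):
    """Return the 24-bit field stored in the CALL, silently truncating."""
    halfwords = ((target - insn_addr) & 0xFFFFFFFF) >> 1
    return halfwords & MASK24


def decode_pcrel24(insn_addr, field):
    """Return the address the CPU would branch to for a given 24-bit field."""
    if field & 0x800000:          # sign-extend from 24 bits
        field -= 0x1000000
    return (insn_addr + (field << 1)) & 0xFFFFFFFF


insn_addr = 0xC463D2 - 2          # relocation location is instruction + 2
field = encode_pcrel24(insn_addr, 0xFFA02060)

print(hex(field))                             # 0x6dde48, as in "after e36d-de48"
print(hex(decode_pcrel24(insn_addr, field)))  # 0x1a02060, the bogus PC
print(hex(decode_pcrel24(0x463D0, 0xFFFC00))) # 0x45bd0, the unpatched field
```

If the assumption holds, the distance from the module's .text to 0xffa02060 does not fit in 24 bits, the top bits are dropped, and the CALL lands on 0x1a02060. The last line also suggests that the "CALL 0x45bd0" gdb showed on the PC was just the unpatched placeholder (e3ff-fc00) being decoded, not a real call target.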
2008-11-20 09:26:23 Re: How does module relocation work
Bernd Schmidt (GERMANY)
Message: 65609
000463d2 R_pcrel24 ___umodsi3
That shouldn't happen. Modules must be compiled with -mlong-calls.
2008-11-20 09:35:02 Re: How does module relocation work
Michael McTernan (UNITED KINGDOM)
Message: 65611
Awesome - that does it! The module's now loaded.
I suspected the alien build environment for this module was dodgy - I already added a bunch of stuff but didn't see the -mlong-calls option the first time around. Thank you for pointing this out!
Kind Regards, Mike
2008-11-20 09:35:50 Re: How does module relocation work
Mike Frysinger (UNITED STATES)
Message: 65612
i wonder if we can detect & reject PCREL relocations to symbols inside of the kernel ...
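As a rough illustration of what such a check might test, here is a sketch of a range test for a 24-bit PC-relative call. It is not taken from the kernel's module loader: `pcrel24_in_range` is a hypothetical name, and the limits assume a signed 24-bit halfword offset, i.e. an even byte displacement between -0x1000000 and +0xFFFFFE.

```python
def pcrel24_in_range(insn_addr, target):
    """Return True if target is reachable from insn_addr with a pcrel24 CALL."""
    delta = target - insn_addr
    # Wrap the difference to a signed 32-bit value, as the hardware would see it.
    delta = (delta + 0x80000000) % 0x100000000 - 0x80000000
    return delta % 2 == 0 and -0x1000000 <= delta <= 0xFFFFFE


# Module .text to ___umodsi3 at 0xffa02060: too far, a loader could reject it.
print(pcrel24_in_range(0xC463D0, 0xFFA02060))  # False

# A call that stays inside the module's own .text is fine.
print(pcrel24_in_range(0xC463D0, 0xC45BD0))    # True
```

A test like this is a single subtraction and comparison per relocation, so it would probably add little to module load time, though where exactly it would belong in the loader is a question for the maintainers.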
2008-11-20 11:39:59 Re: How does module relocation work
Michael McTernan (UNITED KINGDOM)
Message: 65620
Trapping and rejecting the module would sure be nice if it doesn't add much overhead to module loading...
2008-11-20 19:49:17 Re: How does module relocation work
Robin Getz (UNITED STATES)
Message: 65640
Mike:
It's a good idea - Bernd would need to comment on the practicality.
-Robin