[#5447] stack overflow check fail in mpd on bf533-stamp

Document created by Aaronwu Employee on Sep 4, 2013

Submitted By: Barry Song

Open Date: 2009-08-17 18:45:29

Priority: Low

Assignee: Nobody

Status: Open

Fixed In Release: N/A

Found In Release: 2009R1-RC6

Release: 2009R1

Category: N/A

Board: STAMP

Processor: BF533

Silicon Revision:

Is this bug repeatable?: Yes

Resolution: Assigned (Not Start)

Uboot version or rev.:

Toolchain version or rev.: 2009R1-RC9

App binary format: FLAT

Summary: stack overflow check fail in mpd on bf533-stamp

Details:

 

If CFLAGS += -fstack-limit-symbol=_stack_start is used in the mpd Makefile, mpd is killed at load time:

root:/bin> ./mpd

BINFMT_FLAT: reloc outside program 0xbae04 (0 - 0xb0030/0x5f4a0), killing mpd!

SIGSEGV

 

 

 

If -mstack-check-l1 is used instead, the stack check always fails when a new thread is cloned, for example:

Application stack overflow

- Please increase the stack size of the application using elf2flt -s option,

   and/or reduce the stack use of the application.

Deferred Exception context

CURRENT PROCESS:

COMM=mpd PID=202

CPU = 0

TEXT = 0x00300040-0x0035fde0        DATA = 0x0035fe00-0x003aae4c

BSS = 0x003aae4c-0x003b0970  USER-STACK = 0x003cccb8

 

return address: [0x0030a2ec]; contents of:

0x0030a2c0:  9310  e300  5c87  6020  6001  e300  2cdf  e801

0x0030a2d0:  0000  0484  e200  5c44  e14a  ffb0  e10a  0000

0x0030a2e0:  9152  6de2  6dc2  08d6  1402  00a3 [05e3] e800

0x0030a2f0:  0014  e300  5c35  6020  e300  2c70  e14c  003a

 

ADSP-BF533-0.3 398(MHz CCLK) 79(MHz SCLK) (mpu off)

Linux version 2.6.28.10-ADI-2009R1-svn7165

Built with gcc version 4.1.2 (ADI svn)

 

SEQUENCER STATUS:        Not tainted

SEQSTAT: 00000003  IPEND: 0030  SYSCFG: 0006

  EXCAUSE   : 0x3

  interrupts disabled

  physical IVG5 asserted : <0xffa00c5c> { _evt_ivhw + 0x0 }

RETE: <0x00000000> /* Maybe null pointer? */

RETN: <0x00ede000> /* kernel dynamic memory */

RETX: <0x00000480> /* Maybe fixed code section */

RETS: <0x0035610c> [ mpd + 0x560cc ]

PC  : <0x0030a2ec> [ mpd + 0xa2ac ]

DCPLB_FAULT_ADDR: <0xffb00000> /* kernel dynamic memory */

ICPLB_FAULT_ADDR: <0x0030a2ec> [ mpd + 0xa2ac ]

 

PROCESSOR STATE:

R0 : 00000000    R1 : 00292aac    R2 : 00000711    R3 : 00292b04

R4 : 00000001    R5 : 0035fdb4    R6 : 00000001    R7 : 00000000

P0 : 0030a2d8    P1 : 00000000    P2 : 003b0d37    P3 : 003cccbc

P4 : 003aafc8    P5 : 00389380    FP : 00000000    SP : 00eddf24

LB0: 0035a191    LT0: 0035a190    LC0: 00000000

LB1: 00317a3b    LT1: 00317a3a    LC1: 00000000

B0 : 00000000    L0 : 00000000    M0 : 00000000    I0 : 003ccb5c

B1 : 00000000    L1 : 00000000    M1 : 00000000    I1 : 003cb7d4

B2 : 00000000    L2 : 00000000    M2 : 00000000    I2 : 00000000

B3 : 00000000    L3 : 00000000    M3 : 00000000    I3 : 00000000

A0.w: 00000000   A0.x: 00000000   A1.w: 00000000   A1.x: 00000000

USP : 00292aa0  ASTAT: 02003025

 

Hardware Trace:

   0 Target : <0x00004cec> { _trap_c + 0x0 }

     Source : <0xffa0069e> { _exception_to_level5 + 0xae } CALL pcrel

   1 Target : <0xffa005f0> { _exception_to_level5 + 0x0 }

     Source : <0xffa004ac> { _bfin_return_from_exception + 0x20 } RTX

   2 Target : <0xffa0048c> { _bfin_return_from_exception + 0x0 }

     Source : <0xffa00548> { _ex_trap_c + 0x6c } JUMP.S

   3 Target : <0xffa004dc> { _ex_trap_c + 0x0 }

     Source : <0xffa00778> { _trap + 0x68 } JUMP (P4)

   4 Target : <0xffa00730> { _trap + 0x20 }

     Source : <0xffa0072c> { _trap + 0x1c } IF !CC JUMP

   5 Target : <0xffa00710> { _trap + 0x0 }

     Source : <0x0030a2ea> [ mpd + 0xa2aa ] EXCPT 0x3

   6 Target : <0x0030a2d8> [ mpd + 0xa298 ]

     Source : <0x0035610a> [ mpd + 0x560ca ] CALL (P0)

   7 Target : <0x00356100> [ mpd + 0x560c0 ]

     Source : <0xffa00070> { _ret_from_fork + 0x70 } RTI

   8 Target : <0xffa00006> { _ret_from_fork + 0x6 }

     Source : <0x0000d924> { _schedule_tail + 0x54 } RTS

   9 Target : <0x0000d91e> { _schedule_tail + 0x4e }

     Source : <0x0000d8f6> { _schedule_tail + 0x26 } IF !CC JUMP

  10 Target : <0x0000d8e4> { _schedule_tail + 0x14 }

     Source : <0x0000d206> { _finish_task_switch + 0x52 } RTS

  11 Target : <0x0000d1f8> { _finish_task_switch + 0x44 }

     Source : <0x0000d1d8> { _finish_task_switch + 0x24 } IF !CC JUMP

  12 Target : <0x0000d1b4> { _finish_task_switch + 0x0 }

     Source : <0x0000d8e0> { _schedule_tail + 0x10 } CALL pcrel

  13 Target : <0x0000d8d0> { _schedule_tail + 0x0 }

     Source : <0xffa00002> { _ret_from_fork + 0x2 } CALL pcrel

  14 Target : <0xffa00000> { _ret_from_fork + 0x0 }

     Source : <0xffa00972> { _resume + 0x2e } JUMP (P0)

  15 Target : <0xffa00944> { _resume + 0x0 }

     Source : <0xffa01b64> { _schedule + 0x188 } CALL pcrel

Userspace Stack

 

 

 

I checked the assembly near 0x560c0:

000560a8 <_clone>:

   560a8:       10 32           P2 = R0;

   560aa:       00 e8 03 00     LINK 0xc;               /* (12) */

   560ae:       02 30           R0 = R2;

   560b0:       79 ad           P1 = [FP + 0x14];

   560b2:       42 0c           CC = P2 == 0x0;

   560b4:       12 18           IF CC JUMP 0x560d8 <_clone+0x30>;

   560b6:       01 0c           CC = R1 == 0x0;

   560b8:       10 18           IF CC JUMP 0x560d8 <_clone+0x30>;

   560ba:       28 e1 78 00     P0 = 0x78 (X);          /*              P0=0x78(120) */

   560be:       a0 00           EXCPT 0x0;

   560c0:       00 0c           CC = R0 == 0x0;

   560c2:       07 10           IF !CC JUMP 0x560d0 <_clone+0x28>;

   560c4:       41 30           R0 = P1;

   560c6:       42 32           P0 = P2;
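The offsets used for these listings follow from the TEXT start printed in the dump (0x00300040). A minimal sketch of that arithmetic, where the helper name runtime_to_offset is made up for illustration:

```c
#include <stdint.h>

/* TEXT start reported in the dump above ("TEXT = 0x00300040-..."). */
#define MPD_TEXT_START 0x00300040u

/* Convert a run-time address inside mpd's text segment to the
 * "mpd + 0x..." offset that matches the disassembly listing.
 * runtime_to_offset is a hypothetical helper, not part of any tool. */
static uint32_t runtime_to_offset(uint32_t addr)
{
    return addr - MPD_TEXT_START;
}
```

With the values from the hardware trace, runtime_to_offset(0x00356100) gives 0x560c0 (inside _clone) and runtime_to_offset(0x0030a2d8) gives 0xa298 (the start of _filebuf_task).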

 

and near 0xa298:

0000a298 <_filebuf_task>:

    a298:       4a e1 b0 ff     P2.H = 0xffb0;          /* (-80)        P2=0xffb0af20 */

    a29c:       0a e1 00 00     P2.L = 0x0;             /* (  0)        P2=0xffb00000 */

    a2a0:       52 91           P2 = [P2];

    a2a2:       e2 6d           P2 += 0x3c;             /* ( 60) */

    a2a4:       c2 6d           P2 += 0x38;             /* ( 56) */

    a2a6:       d6 08           CC = SP < P2;

    a2a8:       02 14           IF !CC JUMP 0xa2ac <_filebuf_task+0x14> (BP);

    ...
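The prologue above loads a limit word from 0xffb00000, adds 0x3c + 0x38 bytes, and takes the EXCPT 0x3 path when SP is below the result. A rough C model of that comparison is sketched here; the function name is made up, and the limit value used in the usage note is an assumption, since the dump does not show the word stored at 0xffb00000.

```c
#include <stdint.h>

/* Margin added by the _filebuf_task prologue: 0x3c + 0x38 bytes. */
#define FILEBUF_TASK_MARGIN (0x3cu + 0x38u)

/* Model of the prologue check (hypothetical helper).
 * Returns 1 when SP is below limit + margin, i.e. when the
 * EXCPT 0x3 path would be taken, and 0 otherwise. */
static int stack_check_traps(uint32_t sp, uint32_t stack_limit,
                             uint32_t margin)
{
    return sp < stack_limit + margin;
}
```

As an illustration only: if the limit word still described the parent's stack and were somewhere near the BSS end from the dump (0x003b0970, an assumed stand-in), then the child's stack pointer from the dump (USP = 0x00292aa0, inside the malloc'ed block) would trap, while an address near USER-STACK (0x003cccb8) would not.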

 

In the C code, filebuf_task is the entry point of the cloned thread:

int filebuf_task (void *nothing)

{

        unblockSignals();

        mpm_enter(MPM_FILEBUF);

        finishSigHandlers();

        nice(19);

           

        while (1) {

                read_lock = 1;

        ...

}

The thread is created by:

        pid = clone( mpm_tasks[task_id].func,

                     mpm_tasks[task_id].stack

                       + (mpm_tasks[task_id].stack_size - STACK_HEAD_OFFSET),

                     clone_flags,

                     arg);

 

#define clone_flags (CLONE_VM|CLONE_FS|CLONE_FILES|SIGCHLD)

 

 

Follow-ups

 

--- Barry Song                                               2009-08-18 02:43:57

And mpm_tasks[task_id].stack is dynamically allocated in the parent process by:

        mpm_tasks[task_id].stack = malloc(mpm_tasks[task_id].stack_size);

        memset(mpm_tasks[task_id].stack,'\0',mpm_tasks[task_id].stack_size);

 

 

--- Bernd Schmidt                                            2009-08-18 10:00:52

The latter we can't do anything about.  Normally, libpthread tries to ensure

that stack checking continues to work after creating a thread, but if the

application calls clone directly, stack checking can never work for it.

 

The former may be due to Mike's recent change in binfmt_flat to remove one of

our local commits.  Do we still recommend -fstack-limit-symbol somewhere?  If

so, that should probably be changed to -mstack-check-l1 everywhere.
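For comparison with the direct clone call, here is a minimal sketch of starting a task through libpthread with an explicit stack size. It is not mpd's code: start_task and example_task are hypothetical names, and pthread_create does not have the same semantics as clone with the flags shown earlier, so this is not a drop-in replacement.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical task entry, standing in for something like filebuf_task. */
static void *example_task(void *arg)
{
    int *flag = arg;
    *flag = 1;              /* record that the task ran */
    return NULL;
}

/* Start func on a new thread with the requested stack size.
 * Returns 0 on success or the error number reported by pthreads. */
static int start_task(pthread_t *tid, void *(*func)(void *), void *arg,
                      size_t stack_size)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(tid, &attr, func, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

Because the library allocates the stack itself here, it knows both ends of it, which is the information a stack-checking scheme needs.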

 

--- Barry Song                                               2009-08-18 21:42:07

For the latter, would it be possible for us to add stack overflow checking

support for clone later?

 

 

--- Bernd Schmidt                                            2009-08-19 04:19:44

I don't think this would be a good idea.  Programs really shouldn't be using

clone directly; I noticed that the newer versions of mpd seem to have lost this

code.

 

--- Barry Song                                               2009-08-19 04:41:46

Yes, newer versions of mpd use glib threads. And it is better for programs to go

through pthread than to call clone directly. But as a standard libc function or

system call, it may not be reasonable to forbid its use either. Is it possible to

move the thread stack check feature from above clone down into clone itself? I

have no idea.

 

 

--- Bernd Schmidt                                            2009-08-19 05:00:10

One pretty major problem is that clone doesn't get to see the lower bound of the

stack.  So no, I don't think we can or should change clone.
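The point about the lower bound can be illustrated with the stack arithmetic from the clone call quoted earlier. In this sketch the struct and function names are hypothetical, and head_offset stands in for STACK_HEAD_OFFSET, whose value is not given in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a malloc'ed task stack. */
struct task_stack {
    unsigned char *base;    /* lowest address, as returned by malloc */
    size_t size;            /* total size in bytes */
};

/* Top-of-stack pointer as computed for the clone() call:
 * stack + (stack_size - STACK_HEAD_OFFSET). */
static void *task_stack_top(const struct task_stack *s, size_t head_offset)
{
    return s->base + (s->size - head_offset);
}

/* A bounds check is only possible for code that knows base and size. */
static int sp_within_stack(const struct task_stack *s, const void *sp)
{
    uintptr_t p = (uintptr_t)sp;
    uintptr_t lo = (uintptr_t)s->base;

    return p >= lo && p <= lo + s->size;
}
```

Only the value returned by task_stack_top is passed to clone; base and size stay with the caller, so clone has nothing to derive a stack limit from.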

 

--- Mike Frysinger                                           2009-08-27 05:34:47

clone() is not a standard function at all.  it is not in any standard, and the

exact semantics change depending on the architecture.  last i checked, there

were 4 different ways clone() needed to be invoked.

 

at any rate, i built up mpd in trunk with -fstack-limit-symbol=_stack_start and

it worked fine for me.

 

Bernd: note that he is running on 2009R1 kernel here and that version is the

same FLAT code we've shipped the last few years.  i only dropped FLAT stuff in

current trunk.

 

--- Bernd Schmidt                                            2009-09-15 18:24:26

I've looked into this, and there really is a reloc outside the program - which

gets generated because there's a single function that uses a stack frame bigger

than the entire stack (it uses over 0x10000 bytes).  So, in a sense, stack

checking is working, but in a funny way, causing the program to fail early by

not even loading.

 

Is there anything that we want to improve on here?  I suppose we could detect

the problem even earlier in elf2flt, but I don't think we recommend

-fstack-limit-symbol anymore?
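The load failure can be modelled as a simple range check, using the numbers from the BINFMT_FLAT message above. This is a simplified sketch, not the kernel's code: both helper names are made up, and the idea that the offending reference has the form _stack_start + frame size is an assumption about how the limit check is emitted.

```c
#include <stdint.h>

/* Simplified model of the loader's check: a relocation target is
 * accepted only if it does not exceed the program length, i.e. the
 * upper bound printed in the "(0 - 0x.../0x...)" part of the message. */
static int reloc_in_program(uint32_t reloc, uint32_t program_len)
{
    return reloc <= program_len;
}

/* Assumed form of a stack-limit reference: symbol offset plus the
 * frame size of the function being checked (hypothetical helper). */
static uint32_t stack_limit_reloc(uint32_t stack_start_offset,
                                  uint32_t frame_size)
{
    return stack_start_offset + frame_size;
}
```

With the reported values, reloc_in_program(0xbae04, 0xb0030) is false, which matches the "killing mpd!" outcome. Under the stated assumption, any frame size large enough to push stack_limit_reloc past the program length would be rejected the same way, and the same comparison could in principle be made at build time.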

 

--- Mike Frysinger                                           2009-09-15 18:32:54

a sanity check in elf2flt should be pretty easy i think.

 

are there any scenarios that -mstack-check-l1 doesnt support but

-fstack-limit-symbol does ?  wont the latter work on SMP while the former wont

(ignoring cpu affinity) ?

 

 

 
