2010-06-23 05:31:19     user Memory allocation

Document created by Aaronwu Employee on Aug 22, 2013

Prasanth Rajagopal (INDIA)

Message: 90559   

 

I am trying to port an open source algorithm to Blackfin. The code builds and runs as expected on my host Linux platform when built with gcc. I am now attempting to do the same on a BF548 EZ-KIT (silicon revision 0.2).

 

There is no particular need to run it with Linux, I just need to run the algorithm, and measure the cycle count. There is no hardware dependency at all.

 

Since I haven't started looking at the bare metal toolchain, I went on to build the code with bfin-uclinux-gcc (as I normally do). It built, but crashed with a program allocation error as follows:

 

>>>>>>>>>

 

root:/bin> ./main.o
Allocation of length 48869376 from process 404 failed
DMA per-cpu:
CPU    0: hi:   18, btch:   3 usd:   8
Active_anon:0 active_file:191 inactive_anon:0
inactive_file:2237 dirty:0 writeback:0 unstable:0
free:11129 slab:751 mapped:0 pagetables:0 bounce:0
DMA free:44516kB min:4096kB low:5120kB high:6144kB active_anon:0kB inactive_anon:0kB active_file:764kB inactive_file:8948kB present:62988kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 1*4kB 0*8kB 2*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 2*4096kB 0*8192kB 2*16384kB 0*32768kB = 44516kB
2441 total pagecache pages
Unable to allocate RAM for process text/data, errno 12
Undefined instruction
- May be used to emulate instructions that are not defined for
   a particular processor implementation.
Deferred Exception context
CURRENT PROCESS:
COMM=main.o PID=404
CPU = 0
TEXT = 0x00000000-0x00000000        DATA = 0x00000000-0x00000000
BSS = 0x00000000-0x00000000  USER-STACK = 0x00000000

return address: [0x031dc93c]; contents of:
0x031dc910:  5438  0324  5440  0324  544c  0324  5468  0324
0x031dc920:  5470  0324  5478  0324  5480  0324  0000  0000
0x031dc930:  5488  0324  5494  0324  5668  0324 [0001] 0000
0x031dc940:  5670  0324  0002  0000  5678  0324  0007  0000

ADSP-BF548-0.2 525(MHz CCLK) 131(MHz SCLK) (mpu off)
Linux version 2.6.28.10-ADI-2009R1.1
Built with gcc version 4.1.2 (ADI svn)

SEQUENCER STATUS:        Not tainted
SEQSTAT: 00000021  IPEND: 0030  SYSCFG: 0006
  EXCAUSE   : 0x21
  interrupts disabled
  physical IVG5 asserted : <0xffa00be4> { _evt_ivhw + 0x0 }
RETE: <0x00000000> /* Maybe null pointer? */
RETN: <0x032ac000> { :hid_tmff:_hid_compat_thrustmaster + 0x0 }
RETX: <0x00000480> /* Maybe fixed code section */
RETS: <0x031dc8f0> [ /bin/busybox + 0x4d8f0 ]
PC  : <0x031dc93c> [ /bin/busybox + 0x4d93c ]
DCPLB_FAULT_ADDR: <0x0333fa64> [ sh + 0x1fa64 ]
ICPLB_FAULT_ADDR: <0x031dc93c> [ /bin/busybox + 0x4d93c ]

 

>>>

 

Meminfo:

 

root:/bin> cat /proc/meminfo
MemTotal:          60264 kB
MemFree:           44340 kB
Buffers:               0 kB
Cached:             9776 kB
SwapCached:            0 kB
Active:              764 kB
Inactive:           9004 kB
Active(anon):          0 kB
Inactive(anon):        0 kB
Active(file):        764 kB
Inactive(file):     9004 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:             0 kB
Mapped:                0 kB
Slab:               2996 kB
SReclaimable:       1908 kB
SUnreclaim:         1088 kB
PageTables:            0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       30132 kB
Committed_AS:          0 kB
VmallocTotal:          0 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB

>>>>>>>>>>>>>>>>>

 

There is a printf statement in main immediately after the static allocations that never got executed. The error appears to tell me that the memory requirement of the program (47724 kB) is more than what is available (44340 kB). Is 48869376 the total size for text + data? The algorithm mostly uses dynamic allocation, and so does main(). I tried optimizing the code (I wasn't sure whether the problem was text-related), but in vain.
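
The arithmetic behind those two figures can be checked with a short script. This is only a sketch: the helper name `bytes_to_kb` is made up for illustration, and the numbers are copied from the crash log and /proc/meminfo output in this post.

```python
# Values copied from the crash log and /proc/meminfo in this post.
REQUEST_BYTES = 48869376   # "Allocation of length 48869376 ... failed"
MEM_FREE_KB = 44340        # MemFree
MEM_TOTAL_KB = 60264       # MemTotal


def bytes_to_kb(n_bytes):
    """Convert a byte count to kB, taking 1 kB as 1024 bytes."""
    return n_bytes / 1024


request_kb = bytes_to_kb(REQUEST_BYTES)
shortfall_kb = request_kb - MEM_FREE_KB

print(f"request:   {request_kb:.0f} kB")
print(f"free:      {MEM_FREE_KB} kB")
print(f"shortfall: {shortfall_kb:.0f} kB")
print(f"share of MemTotal: {request_kb / MEM_TOTAL_KB:.0%}")
```

The request works out to 47724 kB, which is 3384 kB more than MemFree, so the comparison in the post holds even before asking whether the free memory is contiguous.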

 

Considering my requirement above, what would be the best option? Should I strip the kernel down to a bare minimum (I wouldn't touch the kernel unless I am sure what I am doing)? Is there another way to tackle this in the user code? I can't modify the algorithm either. Perhaps I should start looking at bare metal? Any thoughts / comments are welcome.

 

Let me know if I need to furnish any more diagnostic info.

 

Thanks for the time.

 

- Prasanth.


2010-06-23 11:19:17     Re: user Memory allocation

Mike Frysinger (UNITED STATES)

Message: 90565   

 

there's no way an allocation of that size (48MB) is going to work on a 64MB board

 

you'll have to run `bfin-uclinux-flthdr` on your program to figure out how big the data/text sections are, but i don't believe FLAT can even support execs that big.  more likely you have an unreasonably large uninitialized array (in your bss).

 

use `readelf -s` on the .gdb file to look at the symbols and their sizes.
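
One way to sift through that symbol listing is to sort the data objects by size. The sketch below assumes the usual `readelf -s` column order (Num, Value, Size, Type, Bind, Vis, Ndx, Name); the function name `largest_objects` is hypothetical and not part of any toolchain.

```python
import re

# One symbol-table row of `readelf -s`, assumed column order:
#   Num:    Value  Size Type    Bind   Vis      Ndx Name
ROW = re.compile(
    r"^\s*\d+:\s+[0-9a-fA-F]+\s+(\S+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s+(\S+)\s*$"
)


def largest_objects(readelf_text, top=5):
    """Return the `top` largest OBJECT symbols as (size_in_bytes, name) pairs."""
    found = []
    for line in readelf_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines, blank lines, unnamed symbols
        size_field, sym_type, name = m.groups()
        if sym_type != "OBJECT":
            continue  # skip functions, sections, files, ...
        # Some binutils versions print large sizes in hex with a 0x prefix;
        # base 0 accepts both that and plain decimal.
        found.append((int(size_field, 0), name))
    return sorted(found, reverse=True)[:top]
```

The input would be the text printed by `readelf -s` for the .gdb file. A single entry of tens of megabytes at the top of the result would be the kind of oversized array suspected here.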


2010-06-23 14:46:35     Re: user Memory allocation

Prasanth Rajagopal (INDIA)

Message: 90567   

 

Thanks for pointing that out. I just saw this text in the related document:

 

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

 

4.2.3.1. Troubleshooting

 

If the program dies immediately on start-up with no useful error message, it's likely that too much memory is being allocated for the implementation's many static arrays. Depending on what platform the program is being used to compile and execute the software, there may be ways to increase the amount of space a process can allocate for static arrays. For example, the shells "tcsh" and "bash" have the built-in commands "limit" and "ulimit," respectively, to control various run-time limits, such as a process' maximum stack and data segment sizes. The "gcc" compiler has many options that can also affect a program's memory allocation, although many are platform-dependent.

 

/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

 

I now have an idea where it is coming from (the arrays you mentioned), and I will look at the hints you gave. I know that this same algorithm was used by another open source project, and someone there informed me that it worked for ARM as well, probably with another gcc variant. The project was meant for Linux desktops, so perhaps the portable/embedded systems where people tested the code actually had more memory.

 

I thought the runtime loader basically allocates the amount of space reserved for the .bss and zeroes it out before any userland code begins executing (just my understanding...). When you said 'unreasonable', are you hinting that the memory requirement is too high (as in the above text)?

 

Meanwhile, what do you think about my idea of running the app as bare metal?

 

- Prasanth.


2010-06-23 14:57:57     Re: user Memory allocation

Mike Frysinger (UNITED STATES)

Message: 90568   

 

ulimit won't help because you aren't on an mmu system.  read the overcommit part of:

  docs.blackfin.uclinux.org/doku.php?id=uclinux-dist:difference_from_linux#virtual_memory

 

with FLAT, the kernel is the loader.  there is no dynamic ldso like with ELF files.

 

in terms of algorithm measurement, running under a "quiet" Linux probably won't give you much different results than bare metal.  and it'll probably be easier to develop under and/or debug.  some people even find Linux faster because it tunes all the caches and such, which not everyone can do as well.


2010-06-24 00:30:00     Re: user Memory allocation

Prasanth Rajagopal (INDIA)

Message: 90575   

 

If I use bare metal and can manage stuffs with VDSP (I am a regular on that side), will I be able to get over the allocation issue without modifying the algorithm? That would mean I need to think of some run-time code that does the initialization, but there would be no overhead or constraints from the kernel or the FLAT format.

 

Thanks

 

Prasanth.


2010-06-24 05:23:39     Re: user Memory allocation

Prasanth Rajagopal (INDIA)

Message: 90590   

 

Along with the above question, what if I make the arrays dynamic instead of static?

 

- Prasanth.


2010-06-24 12:50:36     Re: user Memory allocation

Mike Frysinger (UNITED STATES)

Message: 90596   

 

it depends.  if you ultimately attempt to hold the same amount of memory (48MiB) at the same point in time, it isn't going to work.  if you only allocate buffers as you need them (and they're much much smaller) and then free them when you're done, then it probably will work fine.
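
The difference between the two strategies comes down to peak memory held at any one moment. A rough sketch follows; the buffer count and sizes are invented for illustration and are not taken from the actual algorithm, and the model ignores fragmentation entirely.

```python
def peak_live_bytes(events):
    """Return the largest number of bytes held at any one moment.

    `events` is a sequence of ("alloc", size) / ("free", size) steps.
    """
    live = 0
    peak = 0
    for kind, size in events:
        if kind == "alloc":
            live += size
        elif kind == "free":
            live -= size
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
        peak = max(peak, live)
    return peak


MIB = 1024 * 1024

# Hypothetical workload: twelve 4 MiB buffers (numbers invented for illustration).
hold_everything = [("alloc", 4 * MIB)] * 12 + [("free", 4 * MIB)] * 12
one_at_a_time = [("alloc", 4 * MIB), ("free", 4 * MIB)] * 12

print(peak_live_bytes(hold_everything) // MIB, "MiB peak when all buffers are held together")
print(peak_live_bytes(one_at_a_time) // MIB, "MiB peak when each buffer is freed before the next")
```

Holding all twelve buffers at once peaks at the same 48 MiB as the static arrays, while allocating and freeing one at a time peaks at 4 MiB.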


2010-06-24 12:51:09     Re: user Memory allocation

Mike Frysinger (UNITED STATES)

Message: 90597   

 

i don't know what you mean by "manage stuffs with VDSP".  if you're going to use VDSP, then you should just use VDSP.  you can't mix the two toolchains.

 


2010-06-24 14:13:50     Re: user Memory allocation

Prasanth Rajagopal (INDIA)

Message: 90599   

 

Actually, I was referring to this part of the readme, about the role of VDSP when using the bare metal toolchain:

 

 

 

"Also, please keep in mind that this is just a toolchain for compiling code.  It is certainly not an IDE, so do not expect some GUI system to help you edit code or manage your projects. "

 

Apart from that, do you think using bare metal tool-chain could get me out of the memory allocation issues, considering that I am only interested in porting and calculating MIPS for an algorithm built with gcc?

 

 

 

>>>

 

In particular, this thread has my questions on bare metal.

 

https://blackfin.uclinux.org/gf/project/toolchain/forum/?_forum_action=ForumMessageBrowse&thread_id=41480&action=ForumBrowse&forum_id=44


2010-06-24 14:28:27     Re: user Memory allocation

Mike Frysinger (UNITED STATES)

Message: 90604   

 

i don't have access to the program you're playing with, so i have no idea if bare metal will make a difference.  the only difference between bare metal and linux there is that you have full control over the memory layout with bare metal and there is no OS which has allocated resources.  so the limitation is the hardware on the board you're using.


2010-06-28 09:15:34     Re: user Memory allocation

Prasanth Rajagopal (INDIA)

Message: 90681   

 

Mike:

 

I have succeeded with VDSP and am now just experimenting with the Linux environment (not bare metal for now, since the VDSP code is working)...

 

I saw some articles about uClinux here:

 

  www.beyondlogic.org/uClinux/bflt.htm

 

  www.kdvelectronics.eu/uClinux-cisco2500/exotica.html

 

The above article for the Cisco platform has a section "3. Requirements for running 'large' programs" with these subsections:

3.1 kernel: BIGALLOCS
3.2 user.ld

 

I am wondering how valid the above is in my case of large memory requirements. Any thoughts?

 

- Prasanth.


2010-06-28 10:29:37     Re: user Memory allocation

Robin Getz (UNITED STATES)

Message: 90686   

 

Prasanth:

 

If you have succeeded with VDSP - use that.

 

The Linux questions you are asking make no sense. Go back, and re-read Mike's previous answers - if you don't understand them, there are many Linux primers which explain things.

 

https://docs.blackfin.uclinux.org/doku.php?id=references_and_pointers#good_books

 

-Robin


2010-06-28 13:01:22     Re: user Memory allocation

Mike Frysinger (UNITED STATES)

Message: 90700   

 

the bigallocs code is irrelevant.  as you already showed, you're trying to allocate a contiguous chunk that is larger than your free memory.  it simply isn't going to work under Linux.
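
The contiguity point can be read off the "DMA:" free-list line in the crash log from the first post. The sketch below parses that line, assuming each `count*sizekB` entry means `count` free blocks of that size; the function name `parse_free_blocks` is made up for illustration.

```python
import re

# "DMA:" line copied from the crash log in the first post of this thread.
BUDDY_LINE = (
    "1*4kB 0*8kB 2*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB "
    "1*1024kB 1*2048kB 2*4096kB 0*8192kB 2*16384kB 0*32768kB"
)
REQUEST_KB = 48869376 // 1024  # the failed allocation, in kB


def parse_free_blocks(line):
    """Return (count, block_size_kb) pairs from a 'count*sizekB' listing."""
    return [(int(c), int(s)) for c, s in re.findall(r"(\d+)\*(\d+)kB", line)]


blocks = parse_free_blocks(BUDDY_LINE)
total_free_kb = sum(count * size for count, size in blocks)
largest_free_kb = max(size for count, size in blocks if count > 0)

print(f"total free:         {total_free_kb} kB")
print(f"largest free block: {largest_free_kb} kB")
print(f"request:            {REQUEST_KB} kB")
print("fits in one block:", REQUEST_KB <= largest_free_kb)
```

The entries add up to the 44516 kB the log reports, but the largest single free block is 16384 kB, well short of the 47724 kB request.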
