2008-07-04 05:28:59     How to disable OOM killer

Document created by Aaronwu Employee on Aug 7, 2013

2008-07-04 05:28:59     How to disable OOM killer

Yi Li (CHINA)

Message: 58344   

 

Question from a bfin-uclinux developer:

 

On a BF537-STAMP with 64MB of memory, if an application calls malloc() for more memory than the kernel can provide, the kernel will sometimes invoke the out-of-memory (OOM) killer and kill other useful processes. How can the OOM killer be disabled, so that the kernel only returns a memory allocation failure instead of killing innocent processes?

 

To reproduce, run a program like this:

 

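#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
        char *aa = NULL;
        size_t mem_size;

        if (argc < 2) {
                fprintf(stderr, "usage: %s <size in KB>\n", argv[0]);
                return 1;
        }

        /* The size is given in KB on the command line. */
        mem_size = (size_t)atoi(argv[1]) * 1024;

        aa = malloc(mem_size);
        if (aa == NULL) {
                printf("error!\n");
                exit(1);
        }

        /* Hold the allocation forever so the memory stays in use. */
        while (1)
                ;
}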

 

 

root:/> ./mem 16000 &

 

root:/> ./mem 16000 &

 

root:/> free

              total         used         free       shared      buffers

  Mem:        55344        44096        11248            0            0

root:/> cat /proc/buddyinfo

Node 0, zone      DMA      0      0      1      1      1      1      1      1      0      1

 

root:/> ps -ef

  PID  Uid        VSZ Stat Command

    1 root        568 S   /init

    2 root            SW< [kthreadd]

    3 root            SWN [ksoftirqd/0]

    4 root            SW< [events/0]

    5 root            SW< [khelper]

   23 root            SW< [kblockd/0]

   35 root            SW  [pdflush]

   36 root            SW  [pdflush]

   37 root            SW< [kswapd0]

   38 root            SW< [aio/0]

   71 root            SW< [mtdblockd]

   73 root            SW< [bfin-spi.0]

   96 root        484 S   inetd

   98 root        968 S   -/bin/sh

   99 root        476 S   /bin/watchdogd -f -s

  100 root        868 S   /sbin/syslogd -n

  101 root        868 S   /sbin/klogd -n

  120 root      16476 R   ./mem 16000

  121 root      16476 R   ./mem 16000

  123 root        872 R   ps -ef

 

root:/> ./mem 16000 &

 

[snip]

 

Out of memory: kill process 121 (mem) score 252 or a child

Killed process 121 (mem)

Out of memory: kill process 98 (sh) score 11 or a child

Killed process 127 (mem)

Allocation of length 16388096 from process 127 failed

 

root:/> ps -ef

120: Killed

121: Killed

127: Killed

  PID  Uid        VSZ Stat Command

    1 root        568 S   /init

    2 root            SW< [kthreadd]

    3 root            SWN [ksoftirqd/0]

    4 root            SW< [events/0]

    5 root            SW< [khelper]

   23 root            SW< [kblockd/0]

   35 root            SW  [pdflush]

   36 root            SW  [pdflush]

   37 root            SW< [kswapd0]

   38 root            SW< [aio/0]

   71 root            SW< [mtdblockd]

   73 root            SW< [bfin-spi.0]

   96 root        484 S   inetd

   98 root        968 S   -/bin/sh

   99 root        476 S   /bin/watchdogd -f -s

  100 root        868 S   /sbin/syslogd -n

  101 root        868 S   /sbin/klogd -n

  128 root        872 R   ps -ef

 


 

 

2008-07-04 05:32:47     Re: How to disable OOM killer

Yi Li (CHINA)

Message: 58345   

 

I tried "echo 2 > /proc/sys/vm/overcommit_memory", but that does not seem to solve the problem. Maybe you can try the patch below, which disables the OOM killer. But I am not sure whether there is a better solution:

 

Index: page_alloc.c

===================================================================

--- page_alloc.c    (revision 4896)

+++ page_alloc.c    (working copy)

@@ -1352,8 +1352,8 @@

         if (page)

             goto got_pg;

 

-        out_of_memory(zonelist, gfp_mask, order);

-        goto restart;

+        //out_of_memory(zonelist, gfp_mask, order);

+        goto nopage;

     }

 

     /*

 


 

 

2008-07-04 06:34:47     Re: How to disable OOM killer

Bryan Wu (CHINA)

Message: 58347   

 

How about this article and patch?

 

Avoiding the OOM killer with mem_notify

 

http://lwn.net/Articles/267013/

 

-Bryan


 

 

2008-07-04 09:57:19     Re: How to disable OOM killer

Mike Frysinger (UNITED STATES)

Message: 58354   

 

overcommit_memory really only works for MMU systems

 

disabling the OOM killer won't help anything ... it'll just make the kernel loop forever ... after all, you're out of memory


 

 

2008-07-04 12:04:30     Re: How to disable OOM killer

Bryan Wu (CHINA)

Message: 58361   

 

The OOM killer is useful, but it may kill an application that should not be killed.

So we need to let user space applications know about such memory shortage pressure and ask them to free some memory.

 

-Bryan


 

 

2008-07-04 13:21:15     Re: How to disable OOM killer

Robin Getz (UNITED STATES)

Message: 58362   

 

Bryan:

The issue is that sysinfo() (which is how most people determine free memory) returns freeram, the available memory size, but this is normally larger than the largest available contiguous block of memory, which is what decides whether kmalloc can return memory or must fail.

Right now, there is no way (other than poking through /proc/slabinfo) to find the largest available page.

I think the only workable solution would be to add something like that to uClibc's malloc function (checking for available page sizes) and return 0 if there is no available page that is big enough. However, this does not solve the problem; it just moves it somewhere else...

 

-Robin
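
The check described above could be sketched roughly as follows. This is only an illustration, not uClibc code. The post mentions /proc/slabinfo; the sketch instead assumes the per-order free block counts printed by /proc/buddyinfo (the file shown in the first message), and the names largest_free_order() and largest_free_bytes() are hypothetical. The page size is a parameter supplied by the caller. The functions parse a single line that the caller has already read, so they can be tried without a target board.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one line of /proc/buddyinfo, for example
 *   "Node 0, zone      DMA      0      0      1 ..."
 * The Nth count after the zone name is the number of free blocks of
 * 2^N contiguous pages. Returns the highest order that still has at
 * least one free block, or -1 if there is none or the line is malformed.
 */
int largest_free_order(const char *line)
{
    const char *p = strstr(line, "zone");
    int order = 0;
    int largest = -1;

    if (p == NULL)
        return -1;
    p += strlen("zone");

    /* Skip the whitespace before the zone name, then the name itself. */
    while (*p == ' ' || *p == '\t')
        p++;
    while (*p != '\0' && *p != ' ' && *p != '\t')
        p++;

    for (;;) {
        char *end;
        unsigned long count = strtoul(p, &end, 10);

        if (end == p)
            break;              /* no more numbers on this line */
        if (count > 0)
            largest = order;
        order++;
        p = end;
    }
    return largest;
}

/*
 * Size in bytes of the largest free contiguous block described by the
 * line, or 0 if there is none. page_size is an assumption of the caller
 * (for example 4096).
 */
unsigned long largest_free_bytes(const char *line, unsigned long page_size)
{
    int order = largest_free_order(line);

    return order < 0 ? 0 : page_size << order;
}
```

A malloc wrapper could compare the requested size against largest_free_bytes() for each zone line and fail early when no block is large enough. Such a snapshot is only a hint: the free lists can change between the check and the allocation, and the check ignores memory the kernel could still reclaim.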


 

 

2008-07-04 21:48:24     Re: How to disable OOM killer

Bryan Wu (CHINA)

Message: 58364   

 

IMHO, the OOM killer always sucks because it will kill some critical application process to free memory. For an embedded system, we cannot accept a killer hiding in the background that kills some process and then crashes our whole system.

I thought about 2 methods, which require modifying user space applications as well as improving the kernel:

1. If no sufficiently big chunk of memory is available, the malloc call should return failure.

The kernel should notify running applications about this memory pressure and ask them to free some memory.

But this means rewriting the applications, which is sometimes not acceptable.

2. Because our product is an embedded system, we know everything that is running in the box. We can provide a mechanism to let the OOM killer know which processes must not be killed and which can be. For example, in an embedded web server, the httpd application must not be killed, while some connection threads can be, because we can restart the HTTP connection.

Running out of memory is very common in embedded systems, and the OOM killer always gets it wrong when it makes the whole system die.

 

 

 

-Bryan
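
For the second method, mainline Linux exposes a per-process knob under /proc: writing -17 to /proc/<pid>/oom_adj (older kernels) or -1000 to /proc/<pid>/oom_score_adj (newer kernels) asks the OOM killer to skip that process. Whether the Blackfin uClinux kernel discussed in this thread supports either file is an assumption that would need checking on the target. The sketch below shows how a critical process such as httpd might exempt itself; the helper names write_oom_adjust() and exempt_self_from_oom_killer() are hypothetical.

```c
#include <stdio.h>

/*
 * Write an OOM adjustment value to the given proc file.
 * Returns 0 on success, -1 on failure.
 */
int write_oom_adjust(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;

    ok = fprintf(f, "%d\n", value) > 0;
    /* stdio buffers the write, so an error may only show up at fclose(). */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/*
 * Ask the OOM killer to skip the calling process. Tries the newer
 * interface first and falls back to the older one. Which of the two
 * files exists depends on the kernel version (an assumption here).
 */
int exempt_self_from_oom_killer(void)
{
    if (write_oom_adjust("/proc/self/oom_score_adj", -1000) == 0)
        return 0;
    return write_oom_adjust("/proc/self/oom_adj", -17);
}
```

Lowering these values normally requires root (CAP_SYS_RESOURCE), so the call would typically be made early in startup, before privileges are dropped. It does not create memory: if every remaining candidate is exempt, the system can still end up stuck or panic.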


 

 

2008-07-04 23:48:27     Re: How to disable OOM killer

Robin Getz (UNITED STATES)

Message: 58365   

 

Bryan:

 

but since OOM is invoked in kmalloc - by the time you get there - there is no way to tell if you are returning null (malloc failed in userspace) or returning null (kmalloc failed in kernel space).

 

I'm not sure this can be fixed just in kernel space.

 

-Robin


 

 

2008-07-06 21:27:37     Re: How to disable OOM killer

Yi Li (CHINA)

Message: 58391   

 

Some context:

 

The customer board is running nano-x. When some user application is invoked and the system runs out of memory, the OOM killer kills nano-x. The system loses its UI, which is not desirable. The hope is that the kernel only returns failure, and the user application decides what to do next.

 

 


 

 

2008-07-15 03:21:23     Re: How to disable OOM killer

Mike Frysinger (UNITED STATES)

Message: 58805   

 

but what if it was nano-x which tried the malloc() ?  then you've lost your UI anyways

 

i'm not sure there is enough information available in kmalloc for it to make the right decision: return NULL or invoke the OOM killer ...


 

 

2008-07-15 16:28:39     Re: How to disable OOM killer

Robin Getz (UNITED STATES)

Message: 58868   

 

Mike:

 

But there should be in the C library - in malloc, you could look at the largest available page size (look in /proc/slabinfo) and return null if there is not a big enough page (don't even ask the kernel).

 

?

 

Not a great idea - but possible.

 

-Robin


 

 

2008-07-15 20:49:37     Re: How to disable OOM killer

Mike Frysinger (UNITED STATES)

Message: 58881   

 

that doesn't take into account the kernel's ability to reap clean pages ...
