2008-07-04 05:28:59 How to disable OOM killer
Yi Li (CHINA)
Message: 58344
Question from a bfin-uclinux developer:
On a BF537-STAMP with 64MB of memory, if an application calls malloc() for more memory than the kernel can provide, the kernel sometimes invokes the out-of-memory (OOM) killer and kills other useful processes. How can the OOM killer be disabled, so that the kernel only returns a memory allocation failure instead of killing innocent processes?
To reproduce, run a program like this:
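#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
	char *aa = NULL;
	size_t mem_size;
	if (argc < 2) {
		fprintf(stderr, "usage: %s <size in KB>\n", argv[0]);
		return 1;
	}
	/* argv[1] is the allocation size in KB */
	mem_size = (size_t)atoi(argv[1]) * 1024;
	aa = malloc(mem_size);
	if (aa == NULL) {
		printf("error!\n");
		exit(1);
	}
	/* hold the allocation and spin so the process stays runnable */
	while (1)
		;
}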
root:/> ./mem 16000 &
root:/> ./mem 16000 &
root:/> free
total used free shared buffers
Mem: 55344 44096 11248 0 0
root:/> cat /proc/buddyinfo
Node 0, zone DMA 0 0 1 1 1 1 1 1 0 1
root:/> ps -ef
PID Uid VSZ Stat Command
1 root 568 S /init
2 root SW< [kthreadd]
3 root SWN [ksoftirqd/0]
4 root SW< [events/0]
5 root SW< [khelper]
23 root SW< [kblockd/0]
35 root SW [pdflush]
36 root SW [pdflush]
37 root SW< [kswapd0]
38 root SW< [aio/0]
71 root SW< [mtdblockd]
73 root SW< [bfin-spi.0]
96 root 484 S inetd
98 root 968 S -/bin/sh
99 root 476 S /bin/watchdogd -f -s
100 root 868 S /sbin/syslogd -n
101 root 868 S /sbin/klogd -n
120 root 16476 R ./mem 16000
121 root 16476 R ./mem 16000
123 root 872 R ps -ef
root:/> ./mem 16000 &
[snip]
Out of memory: kill process 121 (mem) score 252 or a child
Killed process 121 (mem)
Out of memory: kill process 98 (sh) score 11 or a child
Killed process 127 (mem)
Allocation of length 16388096 from process 127 failed
root:/> ps -ef
120: Killed
121: Killed
127: Killed
PID Uid VSZ Stat Command
1 root 568 S /init
2 root SW< [kthreadd]
3 root SWN [ksoftirqd/0]
4 root SW< [events/0]
5 root SW< [khelper]
23 root SW< [kblockd/0]
35 root SW [pdflush]
36 root SW [pdflush]
37 root SW< [kswapd0]
38 root SW< [aio/0]
71 root SW< [mtdblockd]
73 root SW< [bfin-spi.0]
96 root 484 S inetd
98 root 968 S -/bin/sh
99 root 476 S /bin/watchdogd -f -s
100 root 868 S /sbin/syslogd -n
101 root 868 S /sbin/klogd -n
128 root 872 R ps -ef
2008-07-04 05:32:47 Re: How to disable OOM killer
Yi Li (CHINA)
Message: 58345
I tried "echo 2 > /proc/sys/vm/overcommit_memory", but that does not seem to solve the problem. Maybe you can try the patch below, which disables the OOM killer. But I am not sure whether there is a better solution:
Index: page_alloc.c
===================================================================
--- page_alloc.c (revision 4896)
+++ page_alloc.c (working copy)
@@ -1352,8 +1352,8 @@
if (page)
goto got_pg;
- out_of_memory(zonelist, gfp_mask, order);
- goto restart;
+ //out_of_memory(zonelist, gfp_mask, order);
+ goto nopage;
}
/*
2008-07-04 06:34:47 Re: How to disable OOM killer
Bryan Wu (CHINA)
Message: 58347
How about this article and patch?
Avoiding the OOM killer with mem_notify
http://lwn.net/Articles/267013/
-Bryan
2008-07-04 09:57:19 Re: How to disable OOM killer
Mike Frysinger (UNITED STATES)
Message: 58354
overcommit_memory really only works for MMU systems
disabling the OOM killer won't help anything ... it'll just make the kernel loop forever ... after all, you're out of memory
2008-07-04 12:04:30 Re: How to disable OOM killer
Bryan Wu (CHINA)
Message: 58361
The OOM killer is useful, but it can kill an application that should not be killed.
So we need a way to let user space applications know about memory shortage pressure and ask them to free some memory.
-Bryan
2008-07-04 13:21:15 Re: How to disable OOM killer
Robin Getz (UNITED STATES)
Message: 58362
Bryan:
The issue is that sysinfo() (which is how most people determine free memory) returns freeram, the available memory size, but this is normally larger than the largest available contiguous block of memory, which is what decides whether kmalloc succeeds or fails.
Right now, there is no way (other than poking through /proc/slabinfo) to find the largest available page.
I think the only workable solution would be to add something like that to uClibc's malloc function (checking the available page sizes) and return 0 if there is no available page that is big enough. However, this does not solve the problem, it just moves it somewhere else....
-Robin
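To make the gap between freeram and the largest contiguous block concrete, here is a minimal sketch in C. It assumes the buddy allocator's free lists are read from /proc/buddyinfo (the file shown in the first post, where column N counts free blocks of 2^N pages) rather than from /proc/slabinfo. The helper names free_ram_bytes and largest_free_order are made up for illustration and are not part of uClibc or the kernel.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sysinfo.h>

/* Total free RAM in bytes as reported by sysinfo(); 0 on error. */
unsigned long long free_ram_bytes(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 0;
    return (unsigned long long)si.freeram * si.mem_unit;
}

/*
 * Parse one /proc/buddyinfo line such as
 *   "Node 0, zone DMA 0 0 1 1 1 1 1 1 0 1"
 * and return the highest order with a nonzero free-block count,
 * or -1 if the line has no free blocks or cannot be parsed.
 */
int largest_free_order(const char *line)
{
    const char *p = strstr(line, "zone");
    int order = 0, best = -1;

    if (p == NULL)
        return -1;
    p += 4;                                    /* skip "zone" */
    while (*p && isspace((unsigned char)*p))   /* skip blanks */
        p++;
    while (*p && !isspace((unsigned char)*p))  /* skip the zone name */
        p++;

    for (;;) {
        char *end;
        unsigned long count = strtoul(p, &end, 10);

        if (end == p)          /* no more numbers on the line */
            break;
        if (count > 0)
            best = order;
        order++;
        p = end;
    }
    return best;
}
```

A caller would read each line of /proc/buddyinfo with fgets(), pass it to largest_free_order(), and compare (page size << order) against free_ram_bytes(). The two numbers can differ widely on a fragmented system, which is the point made above.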
2008-07-04 21:48:24 Re: How to disable OOM killer
Bryan Wu (CHINA)
Message: 58364
IMHO, the OOM killer always sucks because it will kill some critical application process to free memory. For an embedded system, we cannot accept a killer hiding in the background that kills some process and then crashes our whole system.
I thought of two methods, both of which require modifying user space applications as well as kernel space improvements:
1. If no sufficiently large chunk of memory is available, the malloc() call should return failure.
The kernel should let running applications know about this memory pressure and ask them to free some.
But then we must rewrite the applications, which is sometimes not acceptable.
2. Because our product is an embedded system, we know everything that is running in the box. We can provide a mechanism to tell the OOM killer which processes can be killed and which cannot. For example, in an embedded web server, the httpd application cannot be killed, while a connection thread can be, because we can restart the HTTP connection.
Out-of-memory conditions are very common in embedded systems, and the OOM killer is always wrong when it makes the whole system die.
-Bryan
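For method 2, mainline Linux has a per-process knob along these lines: 2.6-era kernels expose /proc/<pid>/oom_adj, where -17 (OOM_DISABLE) exempts a process, and newer kernels expose /proc/<pid>/oom_score_adj, where -1000 does the same. Whether the bfin-uclinux kernel in use here provides either file is an assumption that would need checking on the board. A minimal sketch, with made-up helper names:

```c
#include <stdio.h>

/* Write an integer plus newline to `path`; 0 on success, -1 on failure. */
int write_int_to_file(const char *path, int value)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return -1;
    if (fprintf(f, "%d\n", value) < 0) {
        fclose(f);
        return -1;
    }
    /* Buffered write errors only show up when the stream is closed. */
    return fclose(f) == 0 ? 0 : -1;
}

/* Ask the kernel never to pick the calling process as an OOM victim. */
int protect_self_from_oom(void)
{
    /* Newer kernels: oom_score_adj, where -1000 disables OOM kills. */
    if (write_int_to_file("/proc/self/oom_score_adj", -1000) == 0)
        return 0;
    /* 2.6-era kernels: oom_adj, where -17 (OOM_DISABLE) does the same. */
    return write_int_to_file("/proc/self/oom_adj", -17);
}
```

A critical process such as httpd would call protect_self_from_oom() once at startup. Lowering the value needs root (CAP_SYS_RESOURCE). The kernel must still find some other victim when memory runs out, so this only helps if killable processes exist.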
2008-07-04 23:48:27 Re: How to disable OOM killer
Robin Getz (UNITED STATES)
Message: 58365
Bryan:
but since OOM is invoked in kmalloc - by the time you get there - there is no way to tell if you are returning null (malloc failed in userspace) or returning null (kmalloc failed in kernel space).
I'm not sure this can be fixed just in kernel space.
-Robin
2008-07-06 21:27:37 Re: How to disable OOM killer
Yi Li (CHINA)
Message: 58391
Some context:
The customer board is running nano-x. When some user application is invoked and the system runs out of memory, the OOM killer kills nano-x. The system loses its UI, which is not desirable. The hope is that the kernel only returns failure, and the user application decides what to do next.
2008-07-15 03:21:23 Re: How to disable OOM killer
Mike Frysinger (UNITED STATES)
Message: 58805
but what if it was nano-x which tried the malloc() ? then you've lost your UI anyways
i'm not sure there is enough information available in kmalloc for it to make the right decision: return NULL or invoke the OOM killer ...
2008-07-15 16:28:39 Re: How to disable OOM killer
Robin Getz (UNITED STATES)
Message: 58868
Mike:
But there should be in the C library - in malloc, you could look at the largest available page size (look in /proc/slabinfo) and return null if there is not a big enough page (don't even ask the kernel).
?
Not a great idea - but possible.
-Robin
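A rough sketch of that idea in C, under two assumptions that are not confirmed in this thread: that on a no-MMU system the request is served as one power-of-two block of pages, and that the caller already knows the highest buddy order with a free block. The names order_for_size and guarded_malloc are hypothetical and this is not how uClibc's malloc is written.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Smallest buddy order whose block (page_size << order) holds `bytes`. */
int order_for_size(size_t bytes, size_t page_size)
{
    int order = 0;
    size_t block = page_size;

    if (page_size == 0)
        return -1;
    /* Stop doubling before the shift could overflow size_t. */
    while (block < bytes && block <= SIZE_MAX / 2) {
        block <<= 1;
        order++;
    }
    return order;
}

/*
 * Refuse the request up front if no free block is large enough.
 * `largest_free` is the highest order that currently has a free block
 * (hypothetical input; it would have to be read from the kernel).
 */
void *guarded_malloc(size_t bytes, size_t page_size, int largest_free)
{
    if (bytes == 0 || page_size == 0)
        return NULL;
    if (order_for_size(bytes, page_size) > largest_free)
        return NULL;            /* fail without asking the kernel */
    return malloc(bytes);
}
```

With 4096-byte pages, the 16388096-byte request from the log in the first post needs order 12 (a 16 MiB block), so guarded_malloc(16388096, 4096, 9) returns NULL without calling malloc(). The check only looks at blocks that are free right now.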
2008-07-15 20:49:37 Re: How to disable OOM killer
Mike Frysinger (UNITED STATES)
Message: 58881
that doesn't take into account the kernel's ability to reap clean pages ...