[#5968] __alloc_pages_internal() may loop endlessly under certain conditions
Submitted By: Enrik Berkhan
Open Date: 2010-03-15 05:22:39
Priority: Medium
Assignee: Sonic Zhang
Status: Open
Fixed In Release: N/A
Found In Release: 2009R1-RC6
Release:
Category: Memory
Board: Custom
Processor: BF561
Silicon Revision: 0.5
Is this bug repeatable?: Yes
Resolution: Fixed
Uboot version or rev.:
Toolchain version or rev.: 2009R1
App binary format: FDPIC
Summary: __alloc_pages_internal() may loop endlessly under certain conditions
Details:
During system load testing, our systems sometimes hang forever in __alloc_pages_internal() even though plenty of memory is free. The hanging processes can be made to work again by "some external event" such as a telnet login.
The hanging processes called __alloc_pages_internal() from ext4 code with __GFP_FS intentionally cleared in gfp_mask. At first I suspected ext4, so you can find some details on the ext4 list: http://marc.info/?l=linux-ext4&m=126597928719941&w=2
I think one of the reasons for this behavior is the call to drop_pagecache() in __alloc_pages_internal() (a Blackfin-specific addition, which generally helps a lot, BTW). It can cause try_to_free_pages() to return 0 repeatedly, which in turn can trigger endless retrying in __alloc_pages_internal().
My current fix is to call get_page_from_freelist() again after each call to drop_pagecache(), because the probability of getting the pages at that point seems to be very high.
Enrik
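The retry behavior described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not the kernel code and not the attached workaround: every toy_* name, the toy_mem struct, and the max_iters cap (which stands in for "forever") are hypothetical. It assumes that dropping the page cache leaves nothing for the reclaim step to report, so the free list is never checked again unless the re-check is added.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; not a kernel structure. */
struct toy_mem {
    int free_pages;   /* pages on the free list */
    int cached_pages; /* pages held by the page cache */
};

/* Stand-in for drop_pagecache(): moves all cached pages to the free list. */
static void toy_drop_pagecache(struct toy_mem *m)
{
    m->free_pages += m->cached_pages;
    m->cached_pages = 0;
}

/* Stand-in for try_to_free_pages(): returns the number of pages reclaimed. */
static int toy_try_to_free_pages(struct toy_mem *m)
{
    int reclaimed = m->cached_pages;
    m->free_pages += reclaimed;
    m->cached_pages = 0;
    return reclaimed;
}

/* Stand-in for get_page_from_freelist(): takes 'want' pages if available. */
static bool toy_get_page_from_freelist(struct toy_mem *m, int want)
{
    if (m->free_pages < want)
        return false;
    m->free_pages -= want;
    return true;
}

/*
 * Toy retry loop. Returns the iteration on which the allocation succeeded,
 * or -1 if it was still retrying after max_iters rounds.
 */
static int toy_alloc(struct toy_mem *m, int want, bool recheck_after_drop,
                     int max_iters)
{
    assert(want > 0 && max_iters > 0);
    for (int i = 1; i <= max_iters; i++) {
        toy_drop_pagecache(m);
        /* The workaround idea: look at the free list right after the drop. */
        if (recheck_after_drop && toy_get_page_from_freelist(m, want))
            return i;
        /* The free list is only consulted when reclaim reports progress. */
        if (toy_try_to_free_pages(m) > 0 &&
            toy_get_page_from_freelist(m, want))
            return i;
        /* No progress reported: retry, although free_pages may be large. */
    }
    return -1;
}
```

Starting from a toy_mem with 0 free and 8 cached pages, toy_alloc() with recheck_after_drop set to false keeps retrying until the cap is hit, while the same call with recheck_after_drop set to true succeeds on the first iteration.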
Follow-ups
--- Sonic Zhang 2010-03-29 06:44:01
Your suggested workaround introduces an extra performance drop.
get_page_from_freelist() is already called if your gfp_mask has __GFP_FS set.
How about calling get_page_from_freelist() after drop_pagecache() only when
__GFP_FS is not set?
--- Sonic Zhang 2010-03-29 06:58:28
I think it is better to call drop_pagecache() after try_to_free_pages() in
__alloc_pages_internal(), which avoids the endless loop in the first place.
Could you try the attached patch?
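The effect of the proposed reordering can be sketched with a user-space toy model. This is not the attached patch, whose contents are not shown here. All toy_* names, the toy_mem struct, TOY_RECLAIM_BATCH, and the max_iters cap are hypothetical. The sketch assumes the reclaim step reports progress only if cached pages still exist when it runs.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RECLAIM_BATCH 4 /* hypothetical: pages reclaimed per call */

/* Hypothetical toy state; not a kernel structure. */
struct toy_mem {
    int free_pages;   /* pages on the free list */
    int cached_pages; /* pages held by the page cache */
};

/* Stand-in for drop_pagecache(): moves all cached pages to the free list. */
static void toy_drop_pagecache(struct toy_mem *m)
{
    m->free_pages += m->cached_pages;
    m->cached_pages = 0;
}

/* Stand-in for try_to_free_pages(): reclaims up to one batch of pages. */
static int toy_try_to_free_pages(struct toy_mem *m)
{
    int reclaimed = m->cached_pages < TOY_RECLAIM_BATCH
                        ? m->cached_pages : TOY_RECLAIM_BATCH;
    m->cached_pages -= reclaimed;
    m->free_pages += reclaimed;
    return reclaimed;
}

/* Stand-in for get_page_from_freelist(): takes 'want' pages if available. */
static bool toy_get_page_from_freelist(struct toy_mem *m, int want)
{
    if (m->free_pages < want)
        return false;
    m->free_pages -= want;
    return true;
}

/*
 * Toy retry loop with a selectable order of the two steps. Returns the
 * iteration on which the allocation succeeded, or -1 if it was still
 * retrying after max_iters rounds.
 */
static int toy_alloc_ordered(struct toy_mem *m, int want, bool drop_first,
                             int max_iters)
{
    assert(want > 0 && max_iters > 0);
    for (int i = 1; i <= max_iters; i++) {
        int progress;
        if (drop_first) {
            /* Reported order: the drop empties the cache before reclaim. */
            toy_drop_pagecache(m);
            progress = toy_try_to_free_pages(m);
        } else {
            /* Proposed order: reclaim sees the cache, then the drop runs. */
            progress = toy_try_to_free_pages(m);
            toy_drop_pagecache(m);
        }
        if (progress > 0 && toy_get_page_from_freelist(m, want))
            return i;
    }
    return -1;
}
```

With 0 free and 8 cached pages, the drop-first order never reports progress in this model and hits the cap, while the reclaim-first order succeeds on the first iteration.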
Files
File Name                                           File Type     File Size  Posted By
nommu_drop_pagecache_after_try_to_free_pages.patch  text/x-patch  1256       Sonic Zhang
ext4-oom-endless-loop-workaround.diff               text/x-diff   984        Enrik Berkhan