FAQ: [#5968] __alloc_pages_internal() may loop endlessly under certain conditions (2010)

Document created by Aaronwu Employee on Sep 10, 2013
Version 1

[#5968] __alloc_pages_internal() may loop endlessly under certain conditions

Submitted By: Enrik Berkhan

Open Date: 2010-03-15 05:22:39
Priority: Medium
Assignee: Sonic Zhang
Status: Open
Fixed In Release: N/A
Found In Release: 2009R1-RC6
Release:
Category: Memory
Board: Custom
Processor: BF561
Silicon Revision: 0.5
Is this bug repeatable?: Yes
Resolution: Fixed
Uboot version or rev.:
Toolchain version or rev.: 2009R1
App binary format: FDPIC

Summary: __alloc_pages_internal() may loop endlessly under certain conditions

Details:

 

During system load testing, our systems sometimes hang forever in __alloc_pages_internal() even though plenty of memory is free. The hanging processes can be made to work again by "some external event" such as a telnet login.

 

The hanging processes called __alloc_pages_internal() from ext4 code that intentionally clears __GFP_FS in gfp_mask. I first suspected ext4, so some details can be found on the ext4 list: http://marc.info/?l=linux-ext4&m=126597928719941&w=2

 

I think one of the reasons for this behavior is the call to drop_pagecache() (a Blackfin-specific addition, which generally helps a lot, BTW) in __alloc_pages_internal(), which can lead to try_to_free_pages() returning 0 repeatedly. That in turn can trigger endless retrying in __alloc_pages_internal().

 

My current fix is to call get_page_from_freelist() again after each call to drop_pagecache(), because the probability of getting the pages at that point seems to be very high.
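
The mechanism described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not kernel code and not the attached diff: every `toy_` name, the two-counter zone, and the iteration cap are hypothetical, and the real slow path in __alloc_pages_internal() has many more branches. The model assumes that the page-cache drop runs before reclaim on every pass, that reclaim reports 0 once the cache is already empty, and that the free list is looked at again only after reclaim progress or when __GFP_FS is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a "zone" with a free list and reclaimable page cache. */
struct toy_zone {
    int free_pages;   /* pages currently on the free list */
    int cached_pages; /* clean page-cache pages that could be freed */
};

/* Stand-in for get_page_from_freelist(): succeeds if enough pages are free. */
static bool toy_get_page_from_freelist(struct toy_zone *z, int want)
{
    if (z->free_pages < want)
        return false;
    z->free_pages -= want;
    return true;
}

/* Stand-in for the Blackfin drop_pagecache(): frees all cached pages. */
static void toy_drop_pagecache(struct toy_zone *z)
{
    z->free_pages += z->cached_pages;
    z->cached_pages = 0;
}

/* Stand-in for try_to_free_pages(): returns the number of pages reclaimed. */
static int toy_try_to_free_pages(struct toy_zone *z)
{
    int reclaimed = z->cached_pages;

    z->free_pages += reclaimed;
    z->cached_pages = 0;
    return reclaimed;
}

/*
 * Simplified slow path. Returns the number of passes needed to allocate,
 * or -1 if max_iters passes went by without success (the "hang").
 * recheck_after_drop models the workaround from this report.
 */
static int toy_alloc_pages(struct toy_zone *z, int want, bool gfp_fs,
                           bool recheck_after_drop, int max_iters)
{
    assert(want > 0 && max_iters > 0);

    if (toy_get_page_from_freelist(z, want))
        return 0;

    for (int iter = 1; iter <= max_iters; iter++) {
        toy_drop_pagecache(z);
        if (recheck_after_drop && toy_get_page_from_freelist(z, want))
            return iter;

        int progress = toy_try_to_free_pages(z);

        if (progress > 0) {
            if (toy_get_page_from_freelist(z, want))
                return iter;
        } else if (gfp_fs) {
            /* With __GFP_FS set, the model looks at the free list once more. */
            if (toy_get_page_from_freelist(z, want))
                return iter;
        }
        /* No progress and __GFP_FS clear: retry without checking the free list. */
    }
    return -1;
}
```

In this model, a zone with 0 free and 8 cached pages and a request for 4 pages with __GFP_FS clear runs into the iteration cap when recheck_after_drop is false, even though all 8 pages end up on the free list. With recheck_after_drop set to true, the same request succeeds on the first pass.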

 

Enrik

 

Follow-ups

 

--- Sonic Zhang                                              2010-03-29 06:44:01

Your suggested workaround introduces an extra performance drop.

get_page_from_freelist() is already called if your gfp_mask has __GFP_FS set.

How about only calling get_page_from_freelist() after drop_pagecache() when __GFP_FS is not set?

 

--- Sonic Zhang                                              2010-03-29 06:58:28

I think it is better to call drop_pagecache() after try_to_free_pages() in __alloc_pages_internal(), so that the endless loop is avoided in the first place.

 

Could you try the attached patch?
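
The attached patch is not reproduced here. The ordering idea alone can be sketched in a toy user-space model, assuming (as a simplification) that reclaim and the page-cache drop free the same pool of pages and that the free list is checked only when reclaim reports progress. All `toy_` names are hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Which of the two steps runs first in one slow-path pass (toy model). */
enum toy_order { TOY_DROP_FIRST, TOY_RECLAIM_FIRST };

struct toy_mem {
    int free_pages;   /* pages on the free list */
    int cached_pages; /* page-cache pages that can be freed */
};

/* Stand-in for try_to_free_pages(): returns how many pages it freed. */
static int toy_reclaim(struct toy_mem *m)
{
    int n = m->cached_pages;

    m->free_pages += n;
    m->cached_pages = 0;
    return n;
}

/* Stand-in for drop_pagecache(): frees the cache but reports nothing. */
static void toy_drop(struct toy_mem *m)
{
    m->free_pages += m->cached_pages;
    m->cached_pages = 0;
}

/*
 * One slow-path pass with __GFP_FS clear. Returns true if the allocation
 * succeeded, which in this model requires reclaim progress.
 */
static bool toy_slow_path_pass(struct toy_mem *m, int want, enum toy_order order)
{
    int progress;

    assert(want > 0);

    if (order == TOY_DROP_FIRST) {
        toy_drop(m);
        progress = toy_reclaim(m); /* cache already empty: reports 0 */
    } else {
        progress = toy_reclaim(m); /* sees the cache and reports progress */
        toy_drop(m);
    }

    if (progress > 0 && m->free_pages >= want) {
        m->free_pages -= want;
        return true;
    }
    return false;
}
```

Starting from 0 free and 8 cached pages with a request for 4, the TOY_DROP_FIRST pass fails although 8 pages are free afterwards, while the TOY_RECLAIM_FIRST pass succeeds because reclaim itself reports the progress that triggers the free-list check.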

 

 

 

    Files

 

File Name     File Type     File Size     Posted By

nommu_drop_pagecache_after_try_to_free_pages.patch    text/x-patch    1256    Sonic Zhang

ext4-oom-endless-loop-workaround.diff    text/x-diff    984    Enrik Berkhan
