[RESEND PATCH V2] mm: page_alloc: unreserve highatomic page blocks before oom

Posted by Charan Teja Kalla 2 years ago
__alloc_pages_direct_reclaim() is called from the slow path allocation,
where the high atomic reserves can be unreserved after reclaim makes
progress and yet no suitable page is found. Later, should_reclaim_retry()
is called from the slow path allocation to decide whether reclaim should
be retried before the OOM kill path is taken.

should_reclaim_retry() checks the available (reclaimable + free) memory
against the min wmark levels of a zone and returns:
a) true, if it is above the min wmark, so that the slow path allocation
will retry the reclaim.
b) false, so that the slow path allocation takes the OOM kill path.
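
For clarity, the pre-patch decision can be modeled by the simplified,
compilable user-space sketch below. The flat parameters and arithmetic
are illustrative stand-ins for the kernel's per-zone walk, not kernel
APIs:

#include <stdbool.h>

#define MAX_RECLAIM_RETRIES 16  /* matches mm/internal.h */

/* Rough model of the pre-patch should_reclaim_retry() decision. */
static bool should_reclaim_retry_model(long free_kb, long reclaimable_kb,
                                       long highatomic_kb, long min_wmark_kb,
                                       int *no_progress_loops)
{
        long available_kb;

        /* Pre-patch, the reserves are unreserved only on this path. */
        if (++(*no_progress_loops) > MAX_RECLAIM_RETRIES)
                return false;   /* really unreserve_highatomic_pageblock(ac, true) */

        /*
         * The watermark check treats the high atomic reserves as unusable
         * for an ordinary request, so a zone whose free memory is mostly
         * reserved_highatomic looks exhausted here.
         */
        available_kb = (free_kb - highatomic_kb) + reclaimable_kb;
        return available_kb > min_wmark_kb;     /* false => OOM kill path */
}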

should_reclaim_retry() can also unreserve the high atomic reserves,
**but only after all the reclaim retries are exhausted.**

In a case where there is almost no reclaimable memory and the free pages
consist mostly of high atomic reserves that the allocation context cannot
use, the available memory falls below the min wmark levels, hence false
is returned from should_reclaim_retry() and the allocation request takes
the OOM kill path. This can turn into an early OOM kill when the high
atomic reserves are holding a lot of free memory and unreserving them is
never attempted.

An (early) OOM is encountered on a VM in the below state:
[  295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
local_pcp:492kB free_cma:0kB
[  295.998656] lowmem_reserve[]: 0 32
[  295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 7752kB

Per the above log, ~7MB of free memory sits in the high atomic reserves,
yet it is not freed up before falling back to the OOM kill path.
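
Plugging the log's numbers into that check (roughly; the real
calculation is per-zone and counted in pages, not kB):

        free - reserved_highatomic   = 7728kB - 8192kB = -464kB
        + reclaimable (anon + file) ~= -464kB + 52kB   = -412kB
        -412kB < min wmark (804kB)  => no retry, take the OOM kill path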

Fix it by trying to unreserve the high atomic reserves in
should_reclaim_retry() before __alloc_pages_direct_reclaim() can fall
back to the OOM kill path.
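
With the change, the function funnels to a single exit where the
reserves are drained before failure is reported. Condensed, with the
per-zone watermark walk elided:

        if (*no_progress_loops > MAX_RECLAIM_RETRIES)
                goto out;

        /* ... zone walk may set ret = true if a retry looks worthwhile ... */
out:
        /* Before OOM, exhaust highatomic_reserve */
        if (!ret)
                return unreserve_highatomic_pageblock(ac, true);
        return ret;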

Fixes: 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand")
Reported-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
Changes in V2 and RESEND: 
 o Unreserve the high atomic pageblock from should_reclaim_retry().
 o Collected the tags from Michal.
 o Started a separate discussion for the high atomic reserves:
 o https://lore.kernel.org/linux-mm/cover.1699104759.git.quic_charante@quicinc.com/#r

Changes in V1:
 o Tried to unreserve the high atomic page blocks from the oom kill
   path rather than in should_reclaim_retry():
 o https://lore.kernel.org/linux-mm/1698669590-3193-1-git-send-email-quic_charante@quicinc.com/

 mm/page_alloc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 733732e..6d2a741 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3951,14 +3951,9 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 	else
 		(*no_progress_loops)++;
 
-	/*
-	 * Make sure we converge to OOM if we cannot make any progress
-	 * several times in the row.
-	 */
-	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
-		/* Before OOM, exhaust highatomic_reserve */
-		return unreserve_highatomic_pageblock(ac, true);
-	}
+	if (*no_progress_loops > MAX_RECLAIM_RETRIES)
+		goto out;
+
 
 	/*
 	 * Keep reclaiming pages while there is a chance this will lead
@@ -4001,6 +3996,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		schedule_timeout_uninterruptible(1);
 	else
 		cond_resched();
+out:
+	/* Before OOM, exhaust highatomic_reserve */
+	if (!ret)
+		return unreserve_highatomic_pageblock(ac, true);
+
 	return ret;
 }
 
-- 
2.7.4
Re: [RESEND PATCH V2] mm: page_alloc: unreserve highatomic page blocks before oom
Posted by Joakim Tjernlund 1 year, 11 months ago
Could this patch be backported to stable? I have seen similar OOMs with
reserved_highatomic:4096KB

Upstream commit: 04c8716f7b0075def05dc05646e2408f318167d2

  Jocke 
Re: [RESEND PATCH V2] mm: page_alloc: unreserve highatomic page blocks before oom
Posted by Greg KH 1 year, 10 months ago
On Thu, Jan 18, 2024 at 05:23:58PM +0000, Joakim Tjernlund wrote:
> Could this patch be backported to stable? I have seen similar OOMs with
> reserved_highatomic:4096KB
> 
> Upstream commit: 04c8716f7b0075def05dc05646e2408f318167d2

Backported to exactly where?  This commit is in the 4.20 kernel and
newer, please tell me you aren't relying on the 4.19.y kernel anymore...

thanks,

greg k-h
Re: [RESEND PATCH V2] mm: page_alloc: unreserve highatomic page blocks before oom
Posted by David Rientjes 2 years ago
On Fri, 24 Nov 2023, Charan Teja Kalla wrote:

> __alloc_pages_direct_reclaim() is called from the slow path allocation,
> where the high atomic reserves can be unreserved after reclaim makes
> progress and yet no suitable page is found.
> 
> [...]
> 
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>

Acked-by: David Rientjes <rientjes@google.com>