From nobody Mon Feb 9 08:27:55 2026
From: Vlastimil Babka
Date: Fri, 19 Dec 2025 18:38:52 +0100
Subject: [PATCH RFC v2 2/3] mm/page_alloc: refactor the initial compaction
 handling
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20251219-thp-thisnode-tweak-v2-2-0c01f231fd1c@suse.cz>
References: <20251219-thp-thisnode-tweak-v2-0-0c01f231fd1c@suse.cz>
In-Reply-To: <20251219-thp-thisnode-tweak-v2-0-0c01f231fd1c@suse.cz>
To: Andrew Morton, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
 Johannes Weiner, Zi Yan, David Rientjes, David Hildenbrand,
 Lorenzo Stoakes, "Liam R. Howlett", Mike Rapoport, Joshua Hahn,
 Pedro Falcato
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vlastimil Babka
X-Mailer: b4 0.14.3

The initial direct compaction done in some cases in
__alloc_pages_slowpath() stands out from the main retry loop of
reclaim + compaction. We can simplify this by instead skipping the
initial reclaim attempt, controlled by a new local variable
compact_first, and by handling compact_priority so that the original
behavior is preserved.

Suggested-by: Johannes Weiner
Signed-off-by: Vlastimil Babka
Reviewed-by: Joshua Hahn
---
 mm/page_alloc.c | 106 +++++++++++++++++++++++++++++---------------------
 1 file changed, 54 insertions(+), 52 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9e7b0967f1b5..cb8965fd5e20 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4512,6 +4512,11 @@ static bool oom_reserves_allowed(struct task_struct *tsk)
 	return true;
 }
 
+static inline bool gfp_thisnode_noretry(gfp_t gfp_mask)
+{
+	return (gfp_mask & __GFP_NORETRY) && (gfp_mask & __GFP_THISNODE);
+}
+
 /*
  * Distinguish requests which really need access to full memory
  * reserves from oom victims which can live with a portion of it
@@ -4664,7 +4669,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 						struct alloc_context *ac)
 {
 	bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
-	bool can_compact = gfp_compaction_allowed(gfp_mask);
+	bool can_compact = can_direct_reclaim && gfp_compaction_allowed(gfp_mask);
 	bool nofail = gfp_mask & __GFP_NOFAIL;
 	const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
 	struct page *page = NULL;
@@ -4677,6 +4682,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	unsigned int cpuset_mems_cookie;
 	unsigned int zonelist_iter_cookie;
 	int reserve_flags;
+	bool compact_first = false;
 
 	if (unlikely(nofail)) {
 		/*
@@ -4700,6 +4706,19 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	cpuset_mems_cookie = read_mems_allowed_begin();
 	zonelist_iter_cookie = zonelist_iter_begin();
 
+	/*
+	 * For costly allocations, try direct compaction first, as it's likely
+	 * that we have enough base pages and don't need to reclaim. For non-
+	 * movable high-order allocations, do that as well, as compaction will
+	 * try to prevent permanent fragmentation by migrating from blocks of
+	 * the same migratetype.
+	 */
+	if (can_compact && (costly_order || (order > 0 &&
+				ac->migratetype != MIGRATE_MOVABLE))) {
+		compact_first = true;
+		compact_priority = INIT_COMPACT_PRIORITY;
+	}
+
 	/*
 	 * The fast path uses conservative alloc_flags to succeed only until
 	 * kswapd needs to be woken up, and to avoid the cost of setting up
@@ -4742,53 +4761,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
-	/*
-	 * For costly allocations, try direct compaction first, as it's likely
-	 * that we have enough base pages and don't need to reclaim. For non-
-	 * movable high-order allocations, do that as well, as compaction will
-	 * try prevent permanent fragmentation by migrating from blocks of the
-	 * same migratetype.
-	 * Don't try this for allocations that are allowed to ignore
-	 * watermarks, as the ALLOC_NO_WATERMARKS attempt didn't yet happen.
-	 */
-	if (can_direct_reclaim && can_compact &&
-			(costly_order ||
-			   (order > 0 && ac->migratetype != MIGRATE_MOVABLE))
-			&& !gfp_pfmemalloc_allowed(gfp_mask)) {
-		page = __alloc_pages_direct_compact(gfp_mask, order,
-						alloc_flags, ac,
-						INIT_COMPACT_PRIORITY,
-						&compact_result);
-		if (page)
-			goto got_pg;
-
-		/*
-		 * Checks for costly allocations with __GFP_NORETRY, which
-		 * includes some THP page fault allocations
-		 */
-		if (costly_order && (gfp_mask & __GFP_NORETRY)) {
-			/*
-			 * THP page faults may attempt local node only first,
-			 * but are then allowed to only compact, not reclaim,
-			 * see alloc_pages_mpol().
-			 *
-			 * Compaction has failed above and we don't want such
-			 * THP allocations to put reclaim pressure on a single
-			 * node in a situation where other nodes might have
-			 * plenty of available memory.
-			 */
-			if (gfp_mask & __GFP_THISNODE)
-				goto nopage;
-
-			/*
-			 * Proceed with single round of reclaim/compaction, but
-			 * since sync compaction could be very expensive, keep
-			 * using async compaction.
-			 */
-			compact_priority = INIT_COMPACT_PRIORITY;
-		}
-	}
-
 retry:
 	/*
 	 * Deal with possible cpuset update races or zonelist updates to avoid
@@ -4832,10 +4804,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto nopage;
 
 	/* Try direct reclaim and then allocating */
-	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
-							&did_some_progress);
-	if (page)
-		goto got_pg;
+	if (!compact_first) {
+		page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags,
+						    ac, &did_some_progress);
+		if (page)
+			goto got_pg;
+	}
 
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
@@ -4843,6 +4817,34 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
+	if (compact_first) {
+		/*
+		 * THP page faults may attempt local node only first, but are
+		 * then allowed to only compact, not reclaim, see
+		 * alloc_pages_mpol().
+		 *
+		 * Compaction has failed above and we don't want such THP
+		 * allocations to put reclaim pressure on a single node in a
+		 * situation where other nodes might have plenty of available
+		 * memory.
+		 */
+		if (gfp_thisnode_noretry(gfp_mask))
+			goto nopage;
+
+		/*
+		 * For the initial compaction attempt we have lowered its
+		 * priority. Restore it for further retries. With __GFP_NORETRY
+		 * there will be a single round of reclaim+compaction with the
+		 * lowered priority.
+		 */
+		if (!(gfp_mask & __GFP_NORETRY)) {
+			compact_priority = DEF_COMPACT_PRIORITY;
+		}
+
+		compact_first = false;
+		goto retry;
+	}
+
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
 		goto nopage;
-- 
2.52.0
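
Appendix, not part of the patch: a stand-alone illustration of the new
gfp_thisnode_noretry() test, again with made-up flag values (in the
kernel these come from the GFP definitions). As the comment in the
patch notes, a THP page fault restricted to the local node gets
__GFP_THISNODE together with __GFP_NORETRY from alloc_pages_mpol(), so
that is exactly the combination taking the nopage exit after the failed
initial compaction attempt:

	#include <assert.h>
	#include <stdbool.h>

	typedef unsigned int gfp_t;

	#define __GFP_NORETRY	((gfp_t)1 << 0)	/* illustrative values only */
	#define __GFP_THISNODE	((gfp_t)1 << 1)

	static inline bool gfp_thisnode_noretry(gfp_t gfp_mask)
	{
		return (gfp_mask & __GFP_NORETRY) && (gfp_mask & __GFP_THISNODE);
	}

	int main(void)
	{
		/* node-local THP fault: gives up rather than reclaim locally */
		assert(gfp_thisnode_noretry(__GFP_NORETRY | __GFP_THISNODE));
		/* same allocation without the node restriction: keeps going */
		assert(!gfp_thisnode_noretry(__GFP_NORETRY));
		return 0;
	}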