From nobody Tue Feb 10 02:50:03 2026
From: Vlastimil Babka
Date: Tue, 06 Jan 2026 12:52:37 +0100
Subject: [PATCH mm-unstable v3 2/3] mm/page_alloc: refactor the initial
 compaction handling
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20260106-thp-thisnode-tweak-v3-2-f5d67c21a193@suse.cz>
References: <20260106-thp-thisnode-tweak-v3-0-f5d67c21a193@suse.cz>
In-Reply-To: <20260106-thp-thisnode-tweak-v3-0-f5d67c21a193@suse.cz>
To: Andrew Morton, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
 Johannes Weiner, Zi Yan, David Rientjes, David Hildenbrand,
 Lorenzo Stoakes, "Liam R. Howlett", Mike Rapoport, Joshua Hahn,
 Pedro Falcato
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Vlastimil Babka
X-Mailer: b4 0.14.3

The initial direct compaction done in some cases in
__alloc_pages_slowpath() stands out from the main retry loop of
reclaim + compaction.

We can simplify this by instead skipping the initial reclaim attempt via
a new local variable compact_first, and handling compact_priority as
necessary to match the original behavior. No functional change intended.

Suggested-by: Johannes Weiner
Signed-off-by: Vlastimil Babka
Reviewed-by: Joshua Hahn
Acked-by: Michal Hocko
---
 include/linux/gfp.h |   8 ++++-
 mm/page_alloc.c     | 100 +++++++++++++++++++++++++---------------------------
 2 files changed, 55 insertions(+), 53 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index aa45989f410d..6ecf6dda93e0 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -407,9 +407,15 @@ extern gfp_t gfp_allowed_mask;
 /* Returns true if the gfp_mask allows use of ALLOC_NO_WATERMARK */
 bool gfp_pfmemalloc_allowed(gfp_t gfp_mask);
 
+/* A helper for checking if gfp includes all the specified flags */
+static inline bool gfp_has_flags(gfp_t gfp, gfp_t flags)
+{
+	return (gfp & flags) == flags;
+}
+
 static inline bool gfp_has_io_fs(gfp_t gfp)
 {
-	return (gfp & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS);
+	return gfp_has_flags(gfp, __GFP_IO | __GFP_FS);
 }
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b06b1cb01e0e..3b2579c5716f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4702,7 +4702,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 						struct alloc_context *ac)
 {
 	bool can_direct_reclaim = gfp_mask & __GFP_DIRECT_RECLAIM;
-	bool can_compact = gfp_compaction_allowed(gfp_mask);
+	bool can_compact = can_direct_reclaim && gfp_compaction_allowed(gfp_mask);
 	bool nofail = gfp_mask & __GFP_NOFAIL;
 	const bool costly_order = order > PAGE_ALLOC_COSTLY_ORDER;
 	struct page *page = NULL;
@@ -4715,6 +4715,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	unsigned int cpuset_mems_cookie;
 	unsigned int zonelist_iter_cookie;
 	int reserve_flags;
+	bool compact_first = false;
 
 	if (unlikely(nofail)) {
 		/*
@@ -4738,6 +4739,19 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	cpuset_mems_cookie = read_mems_allowed_begin();
 	zonelist_iter_cookie = zonelist_iter_begin();
 
+	/*
+	 * For costly allocations, try direct compaction first, as it's likely
+	 * that we have enough base pages and don't need to reclaim. For non-
+	 * movable high-order allocations, do that as well, as compaction will
+	 * try prevent permanent fragmentation by migrating from blocks of the
+	 * same migratetype.
+	 */
+	if (can_compact && (costly_order || (order > 0 &&
+					ac->migratetype != MIGRATE_MOVABLE))) {
+		compact_first = true;
+		compact_priority = INIT_COMPACT_PRIORITY;
+	}
+
 	/*
 	 * The fast path uses conservative alloc_flags to succeed only until
 	 * kswapd needs to be woken up, and to avoid the cost of setting up
@@ -4780,53 +4794,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
-	/*
-	 * For costly allocations, try direct compaction first, as it's likely
-	 * that we have enough base pages and don't need to reclaim. For non-
-	 * movable high-order allocations, do that as well, as compaction will
-	 * try prevent permanent fragmentation by migrating from blocks of the
-	 * same migratetype.
-	 * Don't try this for allocations that are allowed to ignore
-	 * watermarks, as the ALLOC_NO_WATERMARKS attempt didn't yet happen.
-	 */
-	if (can_direct_reclaim && can_compact &&
-			(costly_order ||
-			   (order > 0 && ac->migratetype != MIGRATE_MOVABLE))
-			&& !gfp_pfmemalloc_allowed(gfp_mask)) {
-		page = __alloc_pages_direct_compact(gfp_mask, order,
-						alloc_flags, ac,
-						INIT_COMPACT_PRIORITY,
-						&compact_result);
-		if (page)
-			goto got_pg;
-
-		/*
-		 * Checks for costly allocations with __GFP_NORETRY, which
-		 * includes some THP page fault allocations
-		 */
-		if (costly_order && (gfp_mask & __GFP_NORETRY)) {
-			/*
-			 * THP page faults may attempt local node only first,
-			 * but are then allowed to only compact, not reclaim,
-			 * see alloc_pages_mpol().
-			 *
-			 * Compaction has failed above and we don't want such
-			 * THP allocations to put reclaim pressure on a single
-			 * node in a situation where other nodes might have
-			 * plenty of available memory.
-			 */
-			if (gfp_mask & __GFP_THISNODE)
-				goto nopage;
-
-			/*
-			 * Proceed with single round of reclaim/compaction, but
-			 * since sync compaction could be very expensive, keep
-			 * using async compaction.
-			 */
-			compact_priority = INIT_COMPACT_PRIORITY;
-		}
-	}
-
 retry:
 	/*
 	 * Deal with possible cpuset update races or zonelist updates to avoid
@@ -4870,10 +4837,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto nopage;
 
 	/* Try direct reclaim and then allocating */
-	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
-							&did_some_progress);
-	if (page)
-		goto got_pg;
+	if (!compact_first) {
+		page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags,
+							ac, &did_some_progress);
+		if (page)
+			goto got_pg;
+	}
 
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
@@ -4881,6 +4850,33 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
+	if (compact_first) {
+		/*
+		 * THP page faults may attempt local node only first, but are
+		 * then allowed to only compact, not reclaim, see
+		 * alloc_pages_mpol().
+		 *
+		 * Compaction has failed above and we don't want such THP
+		 * allocations to put reclaim pressure on a single node in a
+		 * situation where other nodes might have plenty of available
+		 * memory.
+		 */
+		if (gfp_has_flags(gfp_mask, __GFP_NORETRY | __GFP_THISNODE))
+			goto nopage;
+
+		/*
+		 * For the initial compaction attempt we have lowered its
+		 * priority. Restore it for further retries, if those are
+		 * allowed. With __GFP_NORETRY there will be a single round of
+		 * reclaim and compaction with the lowered priority.
+		 */
+		if (!(gfp_mask & __GFP_NORETRY))
+			compact_priority = DEF_COMPACT_PRIORITY;
+
+		compact_first = false;
+		goto retry;
+	}
+
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
 		goto nopage;
-- 
2.52.0
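
For readers following the control flow, here is a minimal userspace sketch of the ordering that compact_first produces in the refactored loop. It is a toy model, not kernel code: the TOY_* names, the flag bit values, the trace letters and toy_slowpath()/toy_selfcheck() are all hypothetical and exist only for this illustration. It assumes every allocation attempt fails and stops after the first full reclaim + compaction round, whereas the real loop may retry further. Only gfp_has_flags() mirrors the helper added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values for this sketch only, not the kernel's. */
#define TOY_GFP_NORETRY		0x1u
#define TOY_GFP_THISNODE	0x2u

/* Stand-ins for INIT_COMPACT_PRIORITY and DEF_COMPACT_PRIORITY. */
enum toy_prio { TOY_INIT_PRIO, TOY_DEF_PRIO };

/* Same logic as the helper in the patch: all bits in flags must be set. */
static inline bool gfp_has_flags(gfp_t gfp, gfp_t flags)
{
	return (gfp & flags) == flags;
}

/*
 * Record the order of attempts, assuming each one fails:
 *   'R' = direct reclaim
 *   'C' = compaction at the lowered (initial) priority
 *   'c' = compaction at the default priority
 */
static void toy_slowpath(gfp_t gfp, bool compact_first, char *trace, size_t cap)
{
	enum toy_prio prio = compact_first ? TOY_INIT_PRIO : TOY_DEF_PRIO;
	size_t n = 0;

	if (cap == 0)
		return;

	for (;;) {
		/* Reclaim is skipped on the compaction-first pass. */
		if (!compact_first && n + 1 < cap)
			trace[n++] = 'R';
		if (n + 1 < cap)
			trace[n++] = (prio == TOY_INIT_PRIO) ? 'C' : 'c';

		if (!compact_first)
			break;	/* the sketch stops after one full round */

		/* Models "goto nopage" for NORETRY + THISNODE requests. */
		if (gfp_has_flags(gfp, TOY_GFP_NORETRY | TOY_GFP_THISNODE))
			break;

		/* Restore the default priority unless NORETRY is set. */
		if (!(gfp & TOY_GFP_NORETRY))
			prio = TOY_DEF_PRIO;

		compact_first = false;	/* models "goto retry" */
	}
	trace[n] = '\0';
}

/* Hypothetical self-check covering the cases discussed in the patch. */
static void toy_selfcheck(void)
{
	char t[8];

	toy_slowpath(0, true, t, sizeof(t));
	assert(strcmp(t, "CRc") == 0);

	toy_slowpath(TOY_GFP_NORETRY, true, t, sizeof(t));
	assert(strcmp(t, "CRC") == 0);

	toy_slowpath(TOY_GFP_NORETRY | TOY_GFP_THISNODE, true, t, sizeof(t));
	assert(strcmp(t, "C") == 0);

	toy_slowpath(0, false, t, sizeof(t));
	assert(strcmp(t, "Rc") == 0);
}
```

Calling toy_selfcheck() from a main() exercises the four cases: a plain compaction-first request, a __GFP_NORETRY request that keeps the lowered priority for its single round, a __GFP_NORETRY | __GFP_THISNODE request that gives up after the initial compaction, and a request that does not qualify for compaction first.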