From: Vlastimil Babka
Date: Fri, 23 Jan 2026 07:52:46 +0100
Subject: [PATCH v4 08/22] slab: make percpu sheaves compatible with kmalloc_nolock()/kfree_nolock()
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260123-sheaves-for-all-v4-8-041323d506f7@suse.cz>
References: <20260123-sheaves-for-all-v4-0-041323d506f7@suse.cz>
In-Reply-To: <20260123-sheaves-for-all-v4-0-041323d506f7@suse.cz>
To: Harry Yoo, Petr Tesarik, Christoph Lameter, David Rientjes, Roman Gushchin
Cc: Hao Li, Andrew Morton, Uladzislau Rezki, "Liam R. Howlett", Suren Baghdasaryan, Sebastian Andrzej Siewior, Alexei Starovoitov, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-rt-devel@lists.linux.dev, bpf@vger.kernel.org, kasan-dev@googlegroups.com, Vlastimil Babka
X-Mailer: b4 0.14.3

Before we enable percpu sheaves for kmalloc caches, we need to make sure
kmalloc_nolock() and kfree_nolock() will continue working properly and
not spin when not allowed to.

Percpu sheaves themselves use local_trylock() so they are already
compatible. We just need to be careful with the barn->lock spin_lock.
Pass a new allow_spin parameter where necessary to use
spin_trylock_irqsave().

In kmalloc_nolock_noprof() we can now attempt alloc_from_pcs() safely,
for now it will always fail until we enable sheaves for kmalloc caches
next. Similarly in kfree_nolock() we can attempt free_to_pcs().

Reviewed-by: Suren Baghdasaryan
Reviewed-by: Harry Yoo
Reviewed-by: Hao Li
Signed-off-by: Vlastimil Babka
Acked-by: Alexei Starovoitov
Reviewed-by: Liam R. Howlett
---
 mm/slub.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 60 insertions(+), 22 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 41e1bf35707c..4ca6bd944854 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2889,7 +2889,8 @@ static void pcs_destroy(struct kmem_cache *s)
 	s->cpu_sheaves = NULL;
 }
 
-static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn)
+static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn,
+					       bool allow_spin)
 {
 	struct slab_sheaf *empty = NULL;
 	unsigned long flags;
@@ -2897,7 +2898,10 @@ static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn)
 	if (!data_race(barn->nr_empty))
 		return NULL;
 
-	spin_lock_irqsave(&barn->lock, flags);
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
 
 	if (likely(barn->nr_empty)) {
 		empty = list_first_entry(&barn->sheaves_empty,
@@ -2974,7 +2978,8 @@ static struct slab_sheaf *barn_get_full_or_empty_sheaf(struct node_barn *barn)
  * change.
  */
 static struct slab_sheaf *
-barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
+barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty,
+			 bool allow_spin)
 {
 	struct slab_sheaf *full = NULL;
 	unsigned long flags;
@@ -2982,7 +2987,10 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
 	if (!data_race(barn->nr_full))
 		return NULL;
 
-	spin_lock_irqsave(&barn->lock, flags);
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
 
 	if (likely(barn->nr_full)) {
 		full = list_first_entry(&barn->sheaves_full, struct slab_sheaf,
@@ -3003,7 +3011,8 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
  * barn. But if there are too many full sheaves, reject this with -E2BIG.
  */
 static struct slab_sheaf *
-barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
+barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full,
+			bool allow_spin)
 {
 	struct slab_sheaf *empty;
 	unsigned long flags;
@@ -3014,7 +3023,10 @@ barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
 	if (!data_race(barn->nr_empty))
 		return ERR_PTR(-ENOMEM);
 
-	spin_lock_irqsave(&barn->lock, flags);
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return ERR_PTR(-EBUSY);
 
 	if (likely(barn->nr_empty)) {
 		empty = list_first_entry(&barn->sheaves_empty, struct slab_sheaf,
@@ -5008,7 +5020,8 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	full = barn_replace_empty_sheaf(barn, pcs->main);
+	full = barn_replace_empty_sheaf(barn, pcs->main,
+					gfpflags_allow_spinning(gfp));
 
 	if (full) {
 		stat(s, BARN_GET);
@@ -5025,7 +5038,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 			empty = pcs->spare;
 			pcs->spare = NULL;
 		} else {
-			empty = barn_get_empty_sheaf(barn);
+			empty = barn_get_empty_sheaf(barn, true);
 		}
 	}
 
@@ -5165,7 +5178,8 @@ void *alloc_from_pcs(struct kmem_cache *s, gfp_t gfp, int node)
 }
 
 static __fastpath_inline
-unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
+unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, gfp_t gfp, size_t size,
+				 void **p)
 {
 	struct slub_percpu_sheaves *pcs;
 	struct slab_sheaf *main;
@@ -5199,7 +5213,8 @@ unsigned int alloc_from_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 			return allocated;
 		}
 
-		full = barn_replace_empty_sheaf(barn, pcs->main);
+		full = barn_replace_empty_sheaf(barn, pcs->main,
+						gfpflags_allow_spinning(gfp));
 
 		if (full) {
 			stat(s, BARN_GET);
@@ -5700,7 +5715,7 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
 	struct kmem_cache *s;
 	bool can_retry = true;
-	void *ret = ERR_PTR(-EBUSY);
+	void *ret;
 
 	VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
 				      __GFP_NO_OBJ_EXT));
@@ -5731,6 +5746,12 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 		 */
 		return NULL;
 
+	ret = alloc_from_pcs(s, alloc_gfp, node);
+	if (ret)
+		goto success;
+
+	ret = ERR_PTR(-EBUSY);
+
 	/*
 	 * Do not call slab_alloc_node(), since trylock mode isn't
 	 * compatible with slab_pre_alloc_hook/should_failslab and
@@ -5767,6 +5788,7 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 		ret = NULL;
 	}
 
+success:
 	maybe_wipe_obj_freeptr(s, ret);
 	slab_post_alloc_hook(s, NULL, alloc_gfp, 1, &ret,
 			     slab_want_init_on_alloc(alloc_gfp, s), size);
@@ -6087,7 +6109,8 @@ static void __pcs_install_empty_sheaf(struct kmem_cache *s,
  * unlocked.
  */
 static struct slub_percpu_sheaves *
-__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
+__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
+			bool allow_spin)
 {
 	struct slab_sheaf *empty;
 	struct node_barn *barn;
@@ -6111,7 +6134,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 	put_fail = false;
 
 	if (!pcs->spare) {
-		empty = barn_get_empty_sheaf(barn);
+		empty = barn_get_empty_sheaf(barn, allow_spin);
 		if (empty) {
 			pcs->spare = pcs->main;
 			pcs->main = empty;
@@ -6125,7 +6148,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 		return pcs;
 	}
 
-	empty = barn_replace_full_sheaf(barn, pcs->main);
+	empty = barn_replace_full_sheaf(barn, pcs->main, allow_spin);
 
 	if (!IS_ERR(empty)) {
 		stat(s, BARN_PUT);
@@ -6133,7 +6156,8 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 		return pcs;
 	}
 
-	if (PTR_ERR(empty) == -E2BIG) {
+	/* sheaf_flush_unused() doesn't support !allow_spin */
+	if (PTR_ERR(empty) == -E2BIG && allow_spin) {
 		/* Since we got here, spare exists and is full */
 		struct slab_sheaf *to_flush = pcs->spare;
 
@@ -6158,6 +6182,14 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
 alloc_empty:
 	local_unlock(&s->cpu_sheaves->lock);
 
+	/*
+	 * alloc_empty_sheaf() doesn't support !allow_spin and it's
+	 * easier to fall back to freeing directly without sheaves
+	 * than add the support (and to sheaf_flush_unused() above)
+	 */
+	if (!allow_spin)
+		return NULL;
+
 	empty = alloc_empty_sheaf(s, GFP_NOWAIT);
 	if (empty)
 		goto got_empty;
@@ -6200,7 +6232,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
  * The object is expected to have passed slab_free_hook() already.
  */
 static __fastpath_inline
-bool free_to_pcs(struct kmem_cache *s, void *object)
+bool free_to_pcs(struct kmem_cache *s, void *object, bool allow_spin)
 {
 	struct slub_percpu_sheaves *pcs;
 
@@ -6211,7 +6243,7 @@ bool free_to_pcs(struct kmem_cache *s, void *object)
 
 	if (unlikely(pcs->main->size == s->sheaf_capacity)) {
 
-		pcs = __pcs_replace_full_main(s, pcs);
+		pcs = __pcs_replace_full_main(s, pcs, allow_spin);
 		if (unlikely(!pcs))
 			return false;
 	}
@@ -6333,7 +6365,7 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
 			goto fail;
 		}
 
-		empty = barn_get_empty_sheaf(barn);
+		empty = barn_get_empty_sheaf(barn, true);
 
 		if (empty) {
 			pcs->rcu_free = empty;
@@ -6453,7 +6485,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 		goto no_empty;
 
 	if (!pcs->spare) {
-		empty = barn_get_empty_sheaf(barn);
+		empty = barn_get_empty_sheaf(barn, true);
 		if (!empty)
 			goto no_empty;
 
@@ -6467,7 +6499,7 @@ static void free_to_pcs_bulk(struct kmem_cache *s, size_t size, void **p)
 		goto do_free;
 	}
 
-	empty = barn_replace_full_sheaf(barn, pcs->main);
+	empty = barn_replace_full_sheaf(barn, pcs->main, true);
 	if (IS_ERR(empty)) {
 		stat(s, BARN_PUT_FAIL);
 		goto no_empty;
@@ -6719,7 +6751,7 @@ void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
 
 	if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())
 	    && likely(!slab_test_pfmemalloc(slab))) {
-		if (likely(free_to_pcs(s, object)))
+		if (likely(free_to_pcs(s, object, true)))
 			return;
 	}
 
@@ -6980,6 +7012,12 @@ void kfree_nolock(const void *object)
 	 * since kasan quarantine takes locks and not supported from NMI.
 	 */
 	kasan_slab_free(s, x, false, false, /* skip quarantine */true);
+
+	if (likely(!IS_ENABLED(CONFIG_NUMA) || slab_nid(slab) == numa_mem_id())) {
+		if (likely(free_to_pcs(s, x, false)))
+			return;
+	}
+
 	do_slab_free(s, slab, x, x, 0, _RET_IP_);
 }
 EXPORT_SYMBOL_GPL(kfree_nolock);
@@ -7532,7 +7570,7 @@ int kmem_cache_alloc_bulk_noprof(struct kmem_cache *s, gfp_t flags, size_t size,
 		size--;
 	}
 
-	i = alloc_from_pcs_bulk(s, flags, size, p);
+	i = alloc_from_pcs_bulk(s, size, p);
 
 	if (i < size) {
 		/*
-- 
2.52.0
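
Illustrative note, not part of the patch: the allow_spin pattern that the barn helpers gain can be sketched in userspace roughly as follows. This is only a sketch under stated assumptions. struct toy_barn and toy_barn_take_empty() are hypothetical names, a pthread mutex stands in for barn->lock, and integer error codes stand in for the NULL and ERR_PTR(-EBUSY) returns used in mm/slub.c.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct node_barn, not the kernel type. */
struct toy_barn {
	pthread_mutex_t lock;	/* plays the role of barn->lock */
	int nr_empty;		/* number of empty sheaves available */
};

/*
 * Same shape as the patched barn helpers: wait for the lock when the caller
 * may spin, otherwise try once and report contention instead of waiting.
 *
 * Returns 0 if an empty sheaf was taken, -EBUSY if the lock was contended
 * and allow_spin is false, -ENOMEM if the barn has no empty sheaves.
 */
static int toy_barn_take_empty(struct toy_barn *barn, bool allow_spin)
{
	int ret = -ENOMEM;

	assert(barn != NULL);

	if (allow_spin)
		pthread_mutex_lock(&barn->lock);
	else if (pthread_mutex_trylock(&barn->lock) != 0)
		return -EBUSY;

	if (barn->nr_empty > 0) {
		barn->nr_empty--;
		ret = 0;
	}

	pthread_mutex_unlock(&barn->lock);
	return ret;
}
```

A caller in the position of kfree_nolock() would pass allow_spin = false and treat any failure as a cue to fall back to the direct free path, while ordinary callers pass true and keep the previous blocking behaviour.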