From nobody Sat Mar 14 08:12:38 2026
Date: Tue, 2 Dec 2025 16:48:17 +0800
From: Hao Li
To: Vlastimil Babka
Cc: Suren Baghdasaryan, "Liam R. Howlett", Christoph Lameter,
	David Rientjes, Roman Gushchin, Harry Yoo, Uladzislau Rezki,
	Sidhartha Kumar, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org, maple-tree@lists.infradead.org,
	Venkat Rao Bagalkote
Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf
References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz>
	<20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz>
In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline

Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from
the per-node barn without requiring an empty sheaf in exchange.

Use this helper in __pcs_replace_empty_main() to change how an empty main
per-CPU sheaf is handled:

  - If pcs->spare is NULL and pcs->main is empty, first try to obtain a
    full sheaf from the barn via barn_get_full_sheaf(). On success, park
    the empty main sheaf in pcs->spare and install the full sheaf as the
    new pcs->main.

  - If pcs->spare already exists and has objects, keep the existing
    behavior of simply swapping pcs->main and pcs->spare.

  - Only when both pcs->main and pcs->spare are empty do we fall back to
    barn_replace_empty_sheaf() and trade the empty main sheaf into the
    barn in exchange for a full one.

This makes the empty-main path more symmetric with __pcs_replace_full_main(),
which for a full main sheaf parks the full sheaf in pcs->spare and pulls an
empty sheaf from the barn. It also matches the documented design more closely:

  "When both percpu sheaves are found empty during an allocation, an empty
  sheaf may be replaced with a full one from the per-node barn."

Signed-off-by: Hao Li
---

* This patch is based on b4/sheaves-for-all branch

 mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index a94c64f56504..1fd28aa204e1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s)
 	s->cpu_sheaves = NULL;
 }
 
+static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn,
+					      bool allow_spin)
+{
+	struct slab_sheaf *full = NULL;
+	unsigned long flags;
+
+	if (!data_race(barn->nr_full))
+		return NULL;
+
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
+
+	if (likely(barn->nr_full)) {
+		full = list_first_entry(&barn->sheaves_full,
+					struct slab_sheaf, barn_list);
+		list_del(&full->barn_list);
+		barn->nr_full--;
+	}
+
+	spin_unlock_irqrestore(&barn->lock, flags);
+
+	return full;
+}
+
 static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn,
 					       bool allow_spin)
 {
@@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 	struct slab_sheaf *empty = NULL;
 	struct slab_sheaf *full;
 	struct node_barn *barn;
-	bool can_alloc;
+	bool can_alloc, allow_spin;
 
 	lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
 
@@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	if (pcs->spare && pcs->spare->size > 0) {
-		swap(pcs->main, pcs->spare);
-		return pcs;
-	}
+	allow_spin = gfpflags_allow_spinning(gfp);
 
 	barn = get_barn(s);
 	if (!barn) {
@@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	full = barn_replace_empty_sheaf(barn, pcs->main,
-					gfpflags_allow_spinning(gfp));
+	if (!pcs->spare) {
+		full = barn_get_full_sheaf(barn, allow_spin);
+		if (full) {
+			pcs->spare = pcs->main;
+			pcs->main = full;
+			return pcs;
+		}
+	} else if (pcs->spare->size > 0) {
+		swap(pcs->main, pcs->spare);
+		return pcs;
+	}
+
+	/* both main and spare are empty */
+
+	full = barn_replace_empty_sheaf(barn, pcs->main, allow_spin);
 
 	if (full) {
 		stat(s, BARN_GET);
-- 
2.50.1
Signed-off-by: Hao Li --- * This patch is based on b4/sheaves-for-all branch mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 43 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a94c64f56504..1fd28aa204e1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s) s->cpu_sheaves =3D NULL; } =20 +static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn, + bool allow_spin) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + if (!data_race(barn->nr_full)) + return NULL; + + if (likely(allow_spin)) + spin_lock_irqsave(&barn->lock, flags); + else if (!spin_trylock_irqsave(&barn->lock, flags)) + return NULL; + + if (likely(barn->nr_full)) { + full =3D list_first_entry(&barn->sheaves_full, + struct slab_sheaf, barn_list); + list_del(&full->barn_list); + barn->nr_full--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn, bool allow_spin) { @@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct= slub_percpu_sheaves *pcs, struct slab_sheaf *empty =3D NULL; struct slab_sheaf *full; struct node_barn *barn; - bool can_alloc; + bool can_alloc, allow_spin; =20 lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); =20 @@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - if (pcs->spare && pcs->spare->size > 0) { - swap(pcs->main, pcs->spare); - return pcs; - } + allow_spin =3D gfpflags_allow_spinning(gfp); =20 barn =3D get_barn(s); if (!barn) { @@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - full =3D barn_replace_empty_sheaf(barn, pcs->main, - gfpflags_allow_spinning(gfp)); + if (!pcs->spare) { + full =3D barn_get_full_sheaf(barn, allow_spin); + if (full) { + pcs->spare =3D pcs->main; + 
pcs->main =3D full; + return pcs; + } + } else if (pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + /* both main and spare are empty */ + + full =3D barn_replace_empty_sheaf(barn, pcs->main, allow_spin); =20 if (full) { stat(s, BARN_GET); --=20 2.50.1 From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025 Date: Tue, 2 Dec 2025 16:45:09 +0800 From: Hao Li To: Vlastimil Babka Cc: Suren Baghdasaryan ,=20 "Liam R. Howlett" , Christoph Lameter ,=20 David Rientjes , Roman Gushchin ,=20 Harry Yoo , Uladzislau Rezki ,=20 Sidhartha Kumar , linux-mm@kvack.org, linux-ke= rnel@vger.kernel.org,=20 rcu@vger.kernel.org, maple-tree@lists.infradead.org,=20 Venkat Rao Bagalkote Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf Message-ID: Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz> <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=3Dus-ascii Content-Disposition: inline In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent Status: RO Content-Length: 36697 Lines: 1141 Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from the per-node barn without requiring an empty sheaf in exchange. Use this helper in __pcs_replace_empty_main() to change how an empty main per-CPU sheaf is handled: - If pcs->spare is NULL and pcs->main is empty, first try to obtain a full sheaf from the barn via barn_get_full_sheaf(). On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. 
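
For illustration only, the three cases above can be modeled in plain
userspace C. This is a rough sketch and not part of the patch: struct sheaf,
struct barn, struct pcs, BARN_CAP, enum refill, barn_get_full(),
barn_exchange_empty() and replace_empty_main() are hypothetical stand-ins
for the kernel structures and helpers. Locking and the allow_spin handling
are left out, and the barn is modeled as two small fixed-size arrays.

```c
#include <assert.h>
#include <stddef.h>

#define BARN_CAP 4	/* hypothetical capacity of each barn list */

/* Hypothetical stand-ins for slab_sheaf, node_barn and slub_percpu_sheaves. */
struct sheaf {
	unsigned int size;	/* number of cached objects */
};

struct barn {
	struct sheaf *full[BARN_CAP];
	unsigned int nr_full;
	struct sheaf *empty[BARN_CAP];
	unsigned int nr_empty;
};

struct pcs {
	struct sheaf *main;
	struct sheaf *spare;
};

enum refill {
	REFILL_TOOK_FULL,	/* spare was NULL: empty main parked as spare */
	REFILL_SWAPPED,		/* spare had objects: main and spare swapped */
	REFILL_EXCHANGED,	/* both empty: empty main traded into the barn */
	REFILL_FAILED,		/* the barn had no full sheaf to offer */
};

/* Detach a full sheaf without handing an empty one back. */
static struct sheaf *barn_get_full(struct barn *b)
{
	if (!b->nr_full)
		return NULL;
	return b->full[--b->nr_full];
}

/* Trade an empty sheaf for a full one; NULL if the trade is not possible. */
static struct sheaf *barn_exchange_empty(struct barn *b, struct sheaf *empty)
{
	struct sheaf *full;

	/* The BARN_CAP check is an artifact of the array model only. */
	if (!b->nr_full || b->nr_empty == BARN_CAP)
		return NULL;
	full = b->full[--b->nr_full];
	b->empty[b->nr_empty++] = empty;
	return full;
}

/* Model of the decision order described above; main must be empty. */
static enum refill replace_empty_main(struct pcs *pcs, struct barn *b)
{
	struct sheaf *full;

	assert(pcs->main && pcs->main->size == 0);

	if (!pcs->spare) {
		full = barn_get_full(b);
		if (full) {
			pcs->spare = pcs->main;
			pcs->main = full;
			return REFILL_TOOK_FULL;
		}
	} else if (pcs->spare->size > 0) {
		struct sheaf *tmp = pcs->main;

		pcs->main = pcs->spare;
		pcs->spare = tmp;
		return REFILL_SWAPPED;
	}

	/* both main and spare are empty, or no full sheaf was available */
	full = barn_exchange_empty(b, pcs->main);
	if (full) {
		pcs->main = full;
		return REFILL_EXCHANGED;
	}
	return REFILL_FAILED;
}
```

In this model the first case leaves the CPU holding two sheaves (one full,
one empty) and does not grow the barn's empty list, while the last case
keeps the sheaf count per CPU unchanged and moves the empty sheaf into the
barn.
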

This makes the empty-main path more symmetric with __pcs_replace_full_main(),
which for a full main sheaf parks the full sheaf in pcs->spare and pulls an
empty sheaf from the barn. It also matches the documented design more closely:

  "When both percpu sheaves are found empty during an allocation, an empty
   sheaf may be replaced with a full one from the per-node barn."

Signed-off-by: Hao Li
---

* This patch is based on b4/sheaves-for-all branch

 mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index a94c64f56504..1fd28aa204e1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s)
 	s->cpu_sheaves = NULL;
 }
 
+static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn,
+					      bool allow_spin)
+{
+	struct slab_sheaf *full = NULL;
+	unsigned long flags;
+
+	if (!data_race(barn->nr_full))
+		return NULL;
+
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
+
+	if (likely(barn->nr_full)) {
+		full = list_first_entry(&barn->sheaves_full,
+					struct slab_sheaf, barn_list);
+		list_del(&full->barn_list);
+		barn->nr_full--;
+	}
+
+	spin_unlock_irqrestore(&barn->lock, flags);
+
+	return full;
+}
+
 static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn,
 					       bool allow_spin)
 {
@@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 	struct slab_sheaf *empty = NULL;
 	struct slab_sheaf *full;
 	struct node_barn *barn;
-	bool can_alloc;
+	bool can_alloc, allow_spin;
 
 	lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
 
@@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	if (pcs->spare && pcs->spare->size > 0) {
-		swap(pcs->main, pcs->spare);
-		return pcs;
-	}
+	allow_spin = gfpflags_allow_spinning(gfp);
 
 	barn = get_barn(s);
 	if (!barn) {
@@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	full = barn_replace_empty_sheaf(barn, pcs->main,
-					gfpflags_allow_spinning(gfp));
+	if (!pcs->spare) {
+		full = barn_get_full_sheaf(barn, allow_spin);
+		if (full) {
+			pcs->spare = pcs->main;
+			pcs->main = full;
+			return pcs;
+		}
+	} else if (pcs->spare->size > 0) {
+		swap(pcs->main, pcs->spare);
+		return pcs;
+	}
+
+	/* both main and spare are empty */
+
+	full = barn_replace_empty_sheaf(barn, pcs->main, allow_spin);
 
 	if (full) {
 		stat(s, BARN_GET);
-- 
2.50.1

From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025
Date: Tue, 2 Dec 2025 16:31:49 +0800
From: Hao Li
To: Vlastimil Babka
Cc: Suren Baghdasaryan, "Liam R. Howlett", Christoph Lameter,
	David Rientjes, Roman Gushchin, Harry Yoo, Uladzislau Rezki,
	Sidhartha Kumar, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org, maple-tree@lists.infradead.org,
	Venkat Rao Bagalkote
Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf
Message-ID:
Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz>
Mutt-Fcc: ~/sent
References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz>
	<20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz>
Mutt-Fcc: ~/sent
Status: RO
Content-Length: 3509
Lines: 119

Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from
the per-node barn without requiring an empty sheaf in exchange.

Use this helper in __pcs_replace_empty_main() to change how an empty main
per-CPU sheaf is handled:

- If pcs->spare is NULL and pcs->main is empty, first try to obtain a full
  sheaf from the barn via barn_get_full_sheaf().
On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. This makes the empty-main path more symmetric with __pcs_replace_full_main(= ), which for a full main sheaf parks the full sheaf in pcs->spare and pulls an empty sheaf from the barn. It also matches the documented design more close= ly: "When both percpu sheaves are found empty during an allocation, an empty sheaf may be replaced with a full one from the per-node barn." Signed-off-by: Hao Li --- * This patch is based on b4/sheaves-for-all branch mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 43 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a94c64f56504..1fd28aa204e1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s) s->cpu_sheaves =3D NULL; } =20 +static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn, + bool allow_spin) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + if (!data_race(barn->nr_full)) + return NULL; + + if (likely(allow_spin)) + spin_lock_irqsave(&barn->lock, flags); + else if (!spin_trylock_irqsave(&barn->lock, flags)) + return NULL; + + if (likely(barn->nr_full)) { + full =3D list_first_entry(&barn->sheaves_full, + struct slab_sheaf, barn_list); + list_del(&full->barn_list); + barn->nr_full--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn, bool allow_spin) { @@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct= slub_percpu_sheaves *pcs, struct slab_sheaf *empty =3D NULL; struct slab_sheaf *full; 
struct node_barn *barn; - bool can_alloc; + bool can_alloc, allow_spin; =20 lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); =20 @@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - if (pcs->spare && pcs->spare->size > 0) { - swap(pcs->main, pcs->spare); - return pcs; - } + allow_spin =3D gfpflags_allow_spinning(gfp); =20 barn =3D get_barn(s); if (!barn) { @@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - full =3D barn_replace_empty_sheaf(barn, pcs->main, - gfpflags_allow_spinning(gfp)); + if (!pcs->spare) { + full =3D barn_get_full_sheaf(barn, allow_spin); + if (full) { + pcs->spare =3D pcs->main; + pcs->main =3D full; + return pcs; + } + } else if (pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + /* both main and spare are empty */ + + full =3D barn_replace_empty_sheaf(barn, pcs->main, allow_spin); =20 if (full) { stat(s, BARN_GET); --=20 2.50.1 From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025 Date: Tue, 2 Dec 2025 16:33:21 +0800 From: Hao Li To: Vlastimil Babka Cc: Suren Baghdasaryan ,=20 "Liam R. 
Howlett" , Christoph Lameter ,=20 David Rientjes , Roman Gushchin ,=20 Harry Yoo , Uladzislau Rezki ,=20 Sidhartha Kumar , linux-mm@kvack.org, linux-ke= rnel@vger.kernel.org,=20 rcu@vger.kernel.org, maple-tree@lists.infradead.org,=20 Venkat Rao Bagalkote Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf Message-ID: Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz> <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=3Dus-ascii Content-Disposition: inline In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent Status: RO Content-Length: 8250 Lines: 265 Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from the per-node barn without requiring an empty sheaf in exchange. Use this helper in __pcs_replace_empty_main() to change how an empty main per-CPU sheaf is handled: - If pcs->spare is NULL and pcs->main is empty, first try to obtain a full sheaf from the barn via barn_get_full_sheaf(). On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. This makes the empty-main path more symmetric with __pcs_replace_full_main(= ), which for a full main sheaf parks the full sheaf in pcs->spare and pulls an empty sheaf from the barn. It also matches the documented design more close= ly: "When both percpu sheaves are found empty during an allocation, an empty sheaf may be replaced with a full one from the per-node barn." 
Signed-off-by: Hao Li --- * This patch is based on b4/sheaves-for-all branch mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 43 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a94c64f56504..1fd28aa204e1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s) s->cpu_sheaves =3D NULL; } =20 +static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn, + bool allow_spin) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + if (!data_race(barn->nr_full)) + return NULL; + + if (likely(allow_spin)) + spin_lock_irqsave(&barn->lock, flags); + else if (!spin_trylock_irqsave(&barn->lock, flags)) + return NULL; + + if (likely(barn->nr_full)) { + full =3D list_first_entry(&barn->sheaves_full, + struct slab_sheaf, barn_list); + list_del(&full->barn_list); + barn->nr_full--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn, bool allow_spin) { @@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct= slub_percpu_sheaves *pcs, struct slab_sheaf *empty =3D NULL; struct slab_sheaf *full; struct node_barn *barn; - bool can_alloc; + bool can_alloc, allow_spin; =20 lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); =20 @@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - if (pcs->spare && pcs->spare->size > 0) { - swap(pcs->main, pcs->spare); - return pcs; - } + allow_spin =3D gfpflags_allow_spinning(gfp); =20 barn =3D get_barn(s); if (!barn) { @@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - full =3D barn_replace_empty_sheaf(barn, pcs->main, - gfpflags_allow_spinning(gfp)); + if (!pcs->spare) { + full =3D barn_get_full_sheaf(barn, allow_spin); + if (full) { + pcs->spare =3D pcs->main; + 
pcs->main =3D full; + return pcs; + } + } else if (pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + /* both main and spare are empty */ + + full =3D barn_replace_empty_sheaf(barn, pcs->main, allow_spin); =20 if (full) { stat(s, BARN_GET); --=20 2.50.1 From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025 Date: Tue, 2 Dec 2025 16:31:49 +0800 From: Hao Li To: Vlastimil Babka Cc: Suren Baghdasaryan ,=20 "Liam R. Howlett" , Christoph Lameter ,=20 David Rientjes , Roman Gushchin ,=20 Harry Yoo , Uladzislau Rezki ,=20 Sidhartha Kumar , linux-mm@kvack.org, linux-ke= rnel@vger.kernel.org,=20 rcu@vger.kernel.org, maple-tree@lists.infradead.org,=20 Venkat Rao Bagalkote Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf Message-ID: Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz> <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=3Dus-ascii Content-Disposition: inline In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent Status: RO Content-Length: 3509 Lines: 119 Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from the per-node barn without requiring an empty sheaf in exchange. Use this helper in __pcs_replace_empty_main() to change how an empty main per-CPU sheaf is handled: - If pcs->spare is NULL and pcs->main is empty, first try to obtain a full sheaf from the barn via barn_get_full_sheaf(). On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. 
This makes the empty-main path more symmetric with __pcs_replace_full_main(= ), which for a full main sheaf parks the full sheaf in pcs->spare and pulls an empty sheaf from the barn. It also matches the documented design more close= ly: "When both percpu sheaves are found empty during an allocation, an empty sheaf may be replaced with a full one from the per-node barn." Signed-off-by: Hao Li --- * This patch is based on b4/sheaves-for-all branch mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 43 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a94c64f56504..1fd28aa204e1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s) s->cpu_sheaves =3D NULL; } =20 +static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn, + bool allow_spin) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + if (!data_race(barn->nr_full)) + return NULL; + + if (likely(allow_spin)) + spin_lock_irqsave(&barn->lock, flags); + else if (!spin_trylock_irqsave(&barn->lock, flags)) + return NULL; + + if (likely(barn->nr_full)) { + full =3D list_first_entry(&barn->sheaves_full, + struct slab_sheaf, barn_list); + list_del(&full->barn_list); + barn->nr_full--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn, bool allow_spin) { @@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct= slub_percpu_sheaves *pcs, struct slab_sheaf *empty =3D NULL; struct slab_sheaf *full; struct node_barn *barn; - bool can_alloc; + bool can_alloc, allow_spin; =20 lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); =20 @@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - if (pcs->spare && pcs->spare->size > 0) { - swap(pcs->main, pcs->spare); - return pcs; - } + allow_spin =3D 
gfpflags_allow_spinning(gfp); =20 barn =3D get_barn(s); if (!barn) { @@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - full =3D barn_replace_empty_sheaf(barn, pcs->main, - gfpflags_allow_spinning(gfp)); + if (!pcs->spare) { + full =3D barn_get_full_sheaf(barn, allow_spin); + if (full) { + pcs->spare =3D pcs->main; + pcs->main =3D full; + return pcs; + } + } else if (pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + /* both main and spare are empty */ + + full =3D barn_replace_empty_sheaf(barn, pcs->main, allow_spin); =20 if (full) { stat(s, BARN_GET); --=20 2.50.1 From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025 Date: Tue, 2 Dec 2025 16:44:16 +0800 From: Hao Li To: Vlastimil Babka Cc: Suren Baghdasaryan ,=20 "Liam R. Howlett" , Christoph Lameter ,=20 David Rientjes , Roman Gushchin ,=20 Harry Yoo , Uladzislau Rezki ,=20 Sidhartha Kumar , linux-mm@kvack.org, linux-ke= rnel@vger.kernel.org,=20 rcu@vger.kernel.org, maple-tree@lists.infradead.org,=20 Venkat Rao Bagalkote Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf Message-ID: Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz> <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=3Dus-ascii Content-Disposition: inline In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent Status: RO Content-Length: 17732 Lines: 557 Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from the per-node barn without requiring an empty sheaf in exchange. Use this helper in __pcs_replace_empty_main() to change how an empty main per-CPU sheaf is handled: - If pcs->spare is NULL and pcs->main is empty, first try to obtain a full sheaf from the barn via barn_get_full_sheaf(). 
On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. This makes the empty-main path more symmetric with __pcs_replace_full_main(= ), which for a full main sheaf parks the full sheaf in pcs->spare and pulls an empty sheaf from the barn. It also matches the documented design more close= ly: "When both percpu sheaves are found empty during an allocation, an empty sheaf may be replaced with a full one from the per-node barn." Signed-off-by: Hao Li --- * This patch is based on b4/sheaves-for-all branch mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 43 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a94c64f56504..1fd28aa204e1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s) s->cpu_sheaves =3D NULL; } =20 +static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn, + bool allow_spin) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + if (!data_race(barn->nr_full)) + return NULL; + + if (likely(allow_spin)) + spin_lock_irqsave(&barn->lock, flags); + else if (!spin_trylock_irqsave(&barn->lock, flags)) + return NULL; + + if (likely(barn->nr_full)) { + full =3D list_first_entry(&barn->sheaves_full, + struct slab_sheaf, barn_list); + list_del(&full->barn_list); + barn->nr_full--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn, bool allow_spin) { @@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct= slub_percpu_sheaves *pcs, struct slab_sheaf *empty =3D NULL; struct slab_sheaf *full; 
struct node_barn *barn; - bool can_alloc; + bool can_alloc, allow_spin; =20 lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); =20 @@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - if (pcs->spare && pcs->spare->size > 0) { - swap(pcs->main, pcs->spare); - return pcs; - } + allow_spin =3D gfpflags_allow_spinning(gfp); =20 barn =3D get_barn(s); if (!barn) { @@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - full =3D barn_replace_empty_sheaf(barn, pcs->main, - gfpflags_allow_spinning(gfp)); + if (!pcs->spare) { + full =3D barn_get_full_sheaf(barn, allow_spin); + if (full) { + pcs->spare =3D pcs->main; + pcs->main =3D full; + return pcs; + } + } else if (pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + /* both main and spare are empty */ + + full =3D barn_replace_empty_sheaf(barn, pcs->main, allow_spin); =20 if (full) { stat(s, BARN_GET); --=20 2.50.1 From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025 Date: Tue, 2 Dec 2025 16:31:49 +0800 From: Hao Li To: Vlastimil Babka Cc: Suren Baghdasaryan ,=20 "Liam R. 
Howlett" , Christoph Lameter ,=20 David Rientjes , Roman Gushchin ,=20 Harry Yoo , Uladzislau Rezki ,=20 Sidhartha Kumar , linux-mm@kvack.org, linux-ke= rnel@vger.kernel.org,=20 rcu@vger.kernel.org, maple-tree@lists.infradead.org,=20 Venkat Rao Bagalkote Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf Message-ID: Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz> <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=3Dus-ascii Content-Disposition: inline In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent Status: RO Content-Length: 3509 Lines: 119 Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from the per-node barn without requiring an empty sheaf in exchange. Use this helper in __pcs_replace_empty_main() to change how an empty main per-CPU sheaf is handled: - If pcs->spare is NULL and pcs->main is empty, first try to obtain a full sheaf from the barn via barn_get_full_sheaf(). On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. This makes the empty-main path more symmetric with __pcs_replace_full_main(= ), which for a full main sheaf parks the full sheaf in pcs->spare and pulls an empty sheaf from the barn. It also matches the documented design more close= ly: "When both percpu sheaves are found empty during an allocation, an empty sheaf may be replaced with a full one from the per-node barn." 
Signed-off-by: Hao Li --- * This patch is based on b4/sheaves-for-all branch mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 43 insertions(+), 7 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index a94c64f56504..1fd28aa204e1 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s) s->cpu_sheaves =3D NULL; } =20 +static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn, + bool allow_spin) +{ + struct slab_sheaf *full =3D NULL; + unsigned long flags; + + if (!data_race(barn->nr_full)) + return NULL; + + if (likely(allow_spin)) + spin_lock_irqsave(&barn->lock, flags); + else if (!spin_trylock_irqsave(&barn->lock, flags)) + return NULL; + + if (likely(barn->nr_full)) { + full =3D list_first_entry(&barn->sheaves_full, + struct slab_sheaf, barn_list); + list_del(&full->barn_list); + barn->nr_full--; + } + + spin_unlock_irqrestore(&barn->lock, flags); + + return full; +} + static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn, bool allow_spin) { @@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct= slub_percpu_sheaves *pcs, struct slab_sheaf *empty =3D NULL; struct slab_sheaf *full; struct node_barn *barn; - bool can_alloc; + bool can_alloc, allow_spin; =20 lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock)); =20 @@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - if (pcs->spare && pcs->spare->size > 0) { - swap(pcs->main, pcs->spare); - return pcs; - } + allow_spin =3D gfpflags_allow_spinning(gfp); =20 barn =3D get_barn(s); if (!barn) { @@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struc= t slub_percpu_sheaves *pcs, return NULL; } =20 - full =3D barn_replace_empty_sheaf(barn, pcs->main, - gfpflags_allow_spinning(gfp)); + if (!pcs->spare) { + full =3D barn_get_full_sheaf(barn, allow_spin); + if (full) { + pcs->spare =3D pcs->main; + 
pcs->main =3D full; + return pcs; + } + } else if (pcs->spare->size > 0) { + swap(pcs->main, pcs->spare); + return pcs; + } + + /* both main and spare are empty */ + + full =3D barn_replace_empty_sheaf(barn, pcs->main, allow_spin); =20 if (full) { stat(s, BARN_GET); --=20 2.50.1 From haoli.tcs@gmail.com Tue Dec 2 16:24:49 2025 Date: Tue, 2 Dec 2025 16:33:21 +0800 From: Hao Li To: Vlastimil Babka Cc: Suren Baghdasaryan ,=20 "Liam R. Howlett" , Christoph Lameter ,=20 David Rientjes , Roman Gushchin ,=20 Harry Yoo , Uladzislau Rezki ,=20 Sidhartha Kumar , linux-mm@kvack.org, linux-ke= rnel@vger.kernel.org,=20 rcu@vger.kernel.org, maple-tree@lists.infradead.org,=20 Venkat Rao Bagalkote Subject: [PATCH] slub: add barn_get_full_sheaf() and refine empty-main sheaf Message-ID: Mutt-References: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent References: <20250910-slub-percpu-caches-v8-0-ca3099d8352c@suse.cz> <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=3Dus-ascii Content-Disposition: inline In-Reply-To: <20250910-slub-percpu-caches-v8-3-ca3099d8352c@suse.cz> Mutt-Fcc: ~/sent Status: RO Content-Length: 8250 Lines: 265 Introduce barn_get_full_sheaf(), a helper that detaches a full sheaf from the per-node barn without requiring an empty sheaf in exchange. Use this helper in __pcs_replace_empty_main() to change how an empty main per-CPU sheaf is handled: - If pcs->spare is NULL and pcs->main is empty, first try to obtain a full sheaf from the barn via barn_get_full_sheaf(). On success, park the empty main sheaf in pcs->spare and install the full sheaf as the new pcs->main. - If pcs->spare already exists and has objects, keep the existing behavior of simply swapping pcs->main and pcs->spare. - Only when both pcs->main and pcs->spare are empty do we fall back to barn_replace_empty_sheaf() and trade the empty main sheaf into the barn in exchange for a full one. 
This makes the empty-main path more symmetric with
__pcs_replace_full_main(), which for a full main sheaf parks the full
sheaf in pcs->spare and pulls an empty sheaf from the barn. It also
matches the documented design more closely:

  "When both percpu sheaves are found empty during an allocation, an
   empty sheaf may be replaced with a full one from the per-node barn."

Signed-off-by: Hao Li
---

* This patch is based on b4/sheaves-for-all branch

 mm/slub.c | 50 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index a94c64f56504..1fd28aa204e1 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2746,6 +2746,32 @@ static void pcs_destroy(struct kmem_cache *s)
 	s->cpu_sheaves = NULL;
 }
 
+static struct slab_sheaf *barn_get_full_sheaf(struct node_barn *barn,
+					      bool allow_spin)
+{
+	struct slab_sheaf *full = NULL;
+	unsigned long flags;
+
+	if (!data_race(barn->nr_full))
+		return NULL;
+
+	if (likely(allow_spin))
+		spin_lock_irqsave(&barn->lock, flags);
+	else if (!spin_trylock_irqsave(&barn->lock, flags))
+		return NULL;
+
+	if (likely(barn->nr_full)) {
+		full = list_first_entry(&barn->sheaves_full,
+					struct slab_sheaf, barn_list);
+		list_del(&full->barn_list);
+		barn->nr_full--;
+	}
+
+	spin_unlock_irqrestore(&barn->lock, flags);
+
+	return full;
+}
+
 static struct slab_sheaf *barn_get_empty_sheaf(struct node_barn *barn,
 					       bool allow_spin)
 {
@@ -4120,7 +4146,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 	struct slab_sheaf *empty = NULL;
 	struct slab_sheaf *full;
 	struct node_barn *barn;
-	bool can_alloc;
+	bool can_alloc, allow_spin;
 
 	lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
 
@@ -4130,10 +4156,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	if (pcs->spare && pcs->spare->size > 0) {
-		swap(pcs->main, pcs->spare);
-		return pcs;
-	}
+	allow_spin = gfpflags_allow_spinning(gfp);
 
 	barn = get_barn(s);
 	if (!barn) {
@@ -4141,8 +4164,21 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 		return NULL;
 	}
 
-	full = barn_replace_empty_sheaf(barn, pcs->main,
-					gfpflags_allow_spinning(gfp));
+	if (!pcs->spare) {
+		full = barn_get_full_sheaf(barn, allow_spin);
+		if (full) {
+			pcs->spare = pcs->main;
+			pcs->main = full;
+			return pcs;
+		}
+	} else if (pcs->spare->size > 0) {
+		swap(pcs->main, pcs->spare);
+		return pcs;
+	}
+
+	/* both main and spare are empty */
+
+	full = barn_replace_empty_sheaf(barn, pcs->main, allow_spin);
 
 	if (full) {
 		stat(s, BARN_GET);
-- 
2.50.1
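
The locking shape of barn_get_full_sheaf() in the diff above (an unlocked
peek at nr_full, then either a spinning lock or a trylock depending on
allow_spin, then a re-check under the lock) can be tried out in userspace
with C11 atomics. The sketch below is an approximation under stated
assumptions: the atomic_flag spinlock, BARN_SLOTS, the array-based barn
and the barn_lock()/barn_trylock()/barn_unlock()/get_full_sheaf() names
are all hypothetical, and the interrupt disabling done by the _irqsave
variants in the kernel has no equivalent here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BARN_SLOTS 4	/* hypothetical capacity, for the model only */

struct sheaf { unsigned int size; };

struct barn {
	atomic_flag lock;		/* stand-in for barn->lock */
	atomic_uint nr_full;		/* peeked at without the lock */
	struct sheaf *full[BARN_SLOTS];
};

static void barn_lock(struct barn *b)
{
	while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static bool barn_trylock(struct barn *b)
{
	return !atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire);
}

static void barn_unlock(struct barn *b)
{
	atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

static struct sheaf *get_full_sheaf(struct barn *b, bool allow_spin)
{
	struct sheaf *full = NULL;
	unsigned int n;

	/* Unlocked peek: possibly stale, only used to skip the lock. */
	if (!atomic_load_explicit(&b->nr_full, memory_order_relaxed))
		return NULL;

	if (allow_spin)
		barn_lock(b);
	else if (!barn_trylock(b))
		return NULL;	/* caller must not spin, so give up */

	/* Re-check under the lock; the last sheaf may be gone by now. */
	n = atomic_load_explicit(&b->nr_full, memory_order_relaxed);
	if (n) {
		full = b->full[n - 1];
		atomic_store_explicit(&b->nr_full, n - 1, memory_order_relaxed);
	}

	barn_unlock(b);
	return full;
}
```

A caller that passes allow_spin == false gets NULL both when the barn is
empty and when the lock is merely contended, so it has to treat NULL as
"no sheaf available right now" rather than "the barn is empty".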