From: Vlastimil Babka
Date: Fri, 23 Jan 2026 07:52:50 +0100
Subject: [PATCH v4 12/22] slab: remove SLUB_CPU_PARTIAL
Message-Id: <20260123-sheaves-for-all-v4-12-041323d506f7@suse.cz>
References: <20260123-sheaves-for-all-v4-0-041323d506f7@suse.cz>
In-Reply-To: <20260123-sheaves-for-all-v4-0-041323d506f7@suse.cz>
To: Harry Yoo, Petr Tesarik, Christoph Lameter, David Rientjes, Roman Gushchin
Cc: Hao Li, Andrew Morton, Uladzislau Rezki, "Liam R. Howlett", Suren Baghdasaryan, Sebastian Andrzej Siewior, Alexei Starovoitov, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-rt-devel@lists.linux.dev, bpf@vger.kernel.org, kasan-dev@googlegroups.com, Vlastimil Babka

We have removed the partial slab usage from allocation paths. Now remove
the whole config option and associated code.

Reviewed-by: Harry Yoo
Reviewed-by: Hao Li
Reviewed-by: Suren Baghdasaryan
Signed-off-by: Vlastimil Babka
---
 mm/Kconfig |  11 ---
 mm/slab.h  |  29 ------
 mm/slub.c  | 321 ++++----------------------------------------------------------
 3 files changed, 19 insertions(+), 342 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index bd0ea5454af8..08593674cd20 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -247,17 +247,6 @@ config SLUB_STATS
 	  out which slabs are relevant to a particular load.
 	  Try running: slabinfo -DA
 
-config SLUB_CPU_PARTIAL
-	default y
-	depends on SMP && !SLUB_TINY
-	bool "Enable per cpu partial caches"
-	help
-	  Per cpu partial caches accelerate objects allocation and freeing
-	  that is local to a processor at the price of more indeterminism
-	  in the latency of the free. On overflow these caches will be cleared
-	  which requires the taking of locks that may cause latency spikes.
-	  Typically one would choose no for a realtime system.
-
 config RANDOM_KMALLOC_CACHES
 	default n
 	depends on !SLUB_TINY
diff --git a/mm/slab.h b/mm/slab.h
index a20a6af6e0ef..0fbe13bec864 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -77,12 +77,6 @@ struct slab {
 					struct llist_node llnode;
 					void *flush_freelist;
 				};
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-				struct {
-					struct slab *next;
-					int slabs; /* Nr of slabs left */
-				};
-#endif
 			};
 			/* Double-word boundary */
 			struct freelist_counters;
@@ -188,23 +182,6 @@ static inline size_t slab_size(const struct slab *slab)
 	return PAGE_SIZE << slab_order(slab);
 }
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-#define slub_percpu_partial(c) ((c)->partial)
-
-#define slub_set_percpu_partial(c, p) \
-({ \
-	slub_percpu_partial(c) = (p)->next; \
-})
-
-#define slub_percpu_partial_read_once(c) READ_ONCE(slub_percpu_partial(c))
-#else
-#define slub_percpu_partial(c) NULL
-
-#define slub_set_percpu_partial(c, p)
-
-#define slub_percpu_partial_read_once(c) NULL
-#endif // CONFIG_SLUB_CPU_PARTIAL
-
 /*
  * Word size structure that can be atomically updated or read and that
  * contains both the order and the number of objects that a slab of the
@@ -228,12 +205,6 @@ struct kmem_cache {
 	unsigned int object_size; /* Object size without metadata */
 	struct reciprocal_value reciprocal_size;
 	unsigned int offset; /* Free pointer offset */
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	/* Number of per cpu partial objects to keep around */
-	unsigned int cpu_partial;
-	/* Number of per cpu partial slabs to keep around */
-	unsigned int cpu_partial_slabs;
-#endif
 	unsigned int sheaf_capacity;
 	struct kmem_cache_order_objects oo;
 
diff --git a/mm/slub.c b/mm/slub.c
index 3a78cee811cf..914b51aedb25 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -268,15 +268,6 @@ void *fixup_red_left(struct kmem_cache *s, void *p)
 	return p;
 }
 
-static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
-{
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	return !kmem_cache_debug(s);
-#else
-	return false;
-#endif
-}
-
 /*
  * Issues still to be resolved:
  *
@@ -431,9 +422,6 @@ struct freelist_tid {
 struct kmem_cache_cpu {
 	struct freelist_tid;
 	struct slab *slab; /* The slab from which we are allocating */
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	struct slab *partial; /* Partially allocated slabs */
-#endif
 	local_trylock_t lock; /* Protects the fields above */
 #ifdef CONFIG_SLUB_STATS
 	unsigned int stat[NR_SLUB_STAT_ITEMS];
@@ -666,29 +654,6 @@ static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
 	return x.x & OO_MASK;
 }
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-static void slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
-{
-	unsigned int nr_slabs;
-
-	s->cpu_partial = nr_objects;
-
-	/*
-	 * We take the number of objects but actually limit the number of
-	 * slabs on the per cpu partial list, in order to limit excessive
-	 * growth of the list. For simplicity we assume that the slabs will
-	 * be half-full.
-	 */
-	nr_slabs = DIV_ROUND_UP(nr_objects * 2, oo_objects(s->oo));
-	s->cpu_partial_slabs = nr_slabs;
-}
-#elif defined(SLAB_SUPPORTS_SYSFS)
-static inline void
-slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
-{
-}
-#endif /* CONFIG_SLUB_CPU_PARTIAL */
-
 /*
  * If network-based swap is enabled, slub must keep track of whether memory
  * were allocated from pfmemalloc reserves.
@@ -3476,12 +3441,6 @@ static void *alloc_single_from_new_slab(struct kmem_cache *s, struct slab *slab,
 	return object;
 }
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-static void put_cpu_partial(struct kmem_cache *s, struct slab *slab, int drain);
-#else
-static inline void put_cpu_partial(struct kmem_cache *s, struct slab *slab,
-				   int drain) { }
-#endif
 static inline bool pfmemalloc_match(struct slab *slab, gfp_t gfpflags);
 
 static bool get_partial_node_bulk(struct kmem_cache *s,
@@ -3894,131 +3853,6 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
 #define local_unlock_cpu_slab(s, flags) \
 	local_unlock_irqrestore(&(s)->cpu_slab->lock, flags)
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-static void __put_partials(struct kmem_cache *s, struct slab *partial_slab)
-{
-	struct kmem_cache_node *n = NULL, *n2 = NULL;
-	struct slab *slab, *slab_to_discard = NULL;
-	unsigned long flags = 0;
-
-	while (partial_slab) {
-		slab = partial_slab;
-		partial_slab = slab->next;
-
-		n2 = get_node(s, slab_nid(slab));
-		if (n != n2) {
-			if (n)
-				spin_unlock_irqrestore(&n->list_lock, flags);
-
-			n = n2;
-			spin_lock_irqsave(&n->list_lock, flags);
-		}
-
-		if (unlikely(!slab->inuse && n->nr_partial >= s->min_partial)) {
-			slab->next = slab_to_discard;
-			slab_to_discard = slab;
-		} else {
-			add_partial(n, slab, DEACTIVATE_TO_TAIL);
-			stat(s, FREE_ADD_PARTIAL);
-		}
-	}
-
-	if (n)
-		spin_unlock_irqrestore(&n->list_lock, flags);
-
-	while (slab_to_discard) {
-		slab = slab_to_discard;
-		slab_to_discard = slab_to_discard->next;
-
-		stat(s, DEACTIVATE_EMPTY);
-		discard_slab(s, slab);
-		stat(s, FREE_SLAB);
-	}
-}
-
-/*
- * Put all the cpu partial slabs to the node partial list.
- */
-static void put_partials(struct kmem_cache *s)
-{
-	struct slab *partial_slab;
-	unsigned long flags;
-
-	local_lock_irqsave(&s->cpu_slab->lock, flags);
-	partial_slab = this_cpu_read(s->cpu_slab->partial);
-	this_cpu_write(s->cpu_slab->partial, NULL);
-	local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-
-	if (partial_slab)
-		__put_partials(s, partial_slab);
-}
-
-static void put_partials_cpu(struct kmem_cache *s,
-			     struct kmem_cache_cpu *c)
-{
-	struct slab *partial_slab;
-
-	partial_slab = slub_percpu_partial(c);
-	c->partial = NULL;
-
-	if (partial_slab)
-		__put_partials(s, partial_slab);
-}
-
-/*
- * Put a slab into a partial slab slot if available.
- *
- * If we did not find a slot then simply move all the partials to the
- * per node partial list.
- */
-static void put_cpu_partial(struct kmem_cache *s, struct slab *slab, int drain)
-{
-	struct slab *oldslab;
-	struct slab *slab_to_put = NULL;
-	unsigned long flags;
-	int slabs = 0;
-
-	local_lock_cpu_slab(s, flags);
-
-	oldslab = this_cpu_read(s->cpu_slab->partial);
-
-	if (oldslab) {
-		if (drain && oldslab->slabs >= s->cpu_partial_slabs) {
-			/*
-			 * Partial array is full. Move the existing set to the
-			 * per node partial list. Postpone the actual unfreezing
-			 * outside of the critical section.
-			 */
-			slab_to_put = oldslab;
-			oldslab = NULL;
-		} else {
-			slabs = oldslab->slabs;
-		}
-	}
-
-	slabs++;
-
-	slab->slabs = slabs;
-	slab->next = oldslab;
-
-	this_cpu_write(s->cpu_slab->partial, slab);
-
-	local_unlock_cpu_slab(s, flags);
-
-	if (slab_to_put) {
-		__put_partials(s, slab_to_put);
-		stat(s, CPU_PARTIAL_DRAIN);
-	}
-}
-
-#else /* CONFIG_SLUB_CPU_PARTIAL */
-
-static inline void put_partials(struct kmem_cache *s) { }
-static inline void put_partials_cpu(struct kmem_cache *s,
-				    struct kmem_cache_cpu *c) { }
-
-#endif /* CONFIG_SLUB_CPU_PARTIAL */
-
 static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
 {
 	unsigned long flags;
@@ -4056,8 +3890,6 @@ static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu)
 		deactivate_slab(s, slab, freelist);
 		stat(s, CPUSLAB_FLUSH);
 	}
-
-	put_partials_cpu(s, c);
 }
 
 static inline void flush_this_cpu_slab(struct kmem_cache *s)
@@ -4066,15 +3898,13 @@ static inline void flush_this_cpu_slab(struct kmem_cache *s)
 
 	if (c->slab)
 		flush_slab(s, c);
-
-	put_partials(s);
 }
 
 static bool has_cpu_slab(int cpu, struct kmem_cache *s)
 {
 	struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);
 
-	return c->slab || slub_percpu_partial(c);
+	return c->slab;
 }
 
 static bool has_pcs_used(int cpu, struct kmem_cache *s)
@@ -5652,13 +5482,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 		return;
 	}
 
-	/*
-	 * It is enough to test IS_ENABLED(CONFIG_SLUB_CPU_PARTIAL) below
-	 * instead of kmem_cache_has_cpu_partial(s), because kmem_cache_debug(s)
-	 * is the only other reason it can be false, and it is already handled
-	 * above.
-	 */
-
 	do {
 		if (unlikely(n)) {
 			spin_unlock_irqrestore(&n->list_lock, flags);
@@ -5683,26 +5506,19 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 		 * Unless it's frozen.
 		 */
 		if ((!new.inuse || was_full) && !was_frozen) {
+
+			n = get_node(s, slab_nid(slab));
 			/*
-			 * If slab becomes non-full and we have cpu partial
-			 * lists, we put it there unconditionally to avoid
-			 * taking the list_lock. Otherwise we need it.
+			 * Speculatively acquire the list_lock.
+			 * If the cmpxchg does not succeed then we may
+			 * drop the list_lock without any processing.
+			 *
+			 * Otherwise the list_lock will synchronize with
+			 * other processors updating the list of slabs.
 			 */
-			if (!(IS_ENABLED(CONFIG_SLUB_CPU_PARTIAL) && was_full)) {
-
-				n = get_node(s, slab_nid(slab));
-				/*
-				 * Speculatively acquire the list_lock.
-				 * If the cmpxchg does not succeed then we may
-				 * drop the list_lock without any processing.
-				 *
-				 * Otherwise the list_lock will synchronize with
-				 * other processors updating the list of slabs.
-				 */
-				spin_lock_irqsave(&n->list_lock, flags);
-
-				on_node_partial = slab_test_node_partial(slab);
-			}
+			spin_lock_irqsave(&n->list_lock, flags);
+
+			on_node_partial = slab_test_node_partial(slab);
 		}
 
 	} while (!slab_update_freelist(s, slab, &old, &new, "__slab_free"));
@@ -5715,13 +5531,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 			 * activity can be necessary.
 			 */
 			stat(s, FREE_FROZEN);
-		} else if (IS_ENABLED(CONFIG_SLUB_CPU_PARTIAL) && was_full) {
-			/*
-			 * If we started with a full slab then put it onto the
-			 * per cpu partial list.
-			 */
-			put_cpu_partial(s, slab, 1);
-			stat(s, CPU_PARTIAL_FREE);
 		}
 
 		/*
@@ -5750,10 +5559,9 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 
 	/*
 	 * Objects left in the slab. If it was not on the partial list before
-	 * then add it. This can only happen when cache has no per cpu partial
-	 * list otherwise we would have put it there.
+	 * then add it.
 	 */
-	if (!IS_ENABLED(CONFIG_SLUB_CPU_PARTIAL) && unlikely(was_full)) {
+	if (unlikely(was_full)) {
 		add_partial(n, slab, DEACTIVATE_TO_TAIL);
 		stat(s, FREE_ADD_PARTIAL);
 	}
@@ -6419,8 +6227,8 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
 	if (unlikely(!allow_spin)) {
 		/*
 		 * __slab_free() can locklessly cmpxchg16 into a slab,
-		 * but then it might need to take spin_lock or local_lock
-		 * in put_cpu_partial() for further processing.
+		 * but then it might need to take spin_lock
+		 * for further processing.
 		 * Avoid the complexity and simply add to a deferred list.
 		 */
 		defer_free(s, head);
@@ -7734,39 +7542,6 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
 	return 1;
 }
 
-static void set_cpu_partial(struct kmem_cache *s)
-{
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	unsigned int nr_objects;
-
-	/*
-	 * cpu_partial determined the maximum number of objects kept in the
-	 * per cpu partial lists of a processor.
-	 *
-	 * Per cpu partial lists mainly contain slabs that just have one
-	 * object freed. If they are used for allocation then they can be
-	 * filled up again with minimal effort. The slab will never hit the
-	 * per node partial lists and therefore no locking will be required.
-	 *
-	 * For backwards compatibility reasons, this is determined as number
-	 * of objects, even though we now limit maximum number of pages, see
-	 * slub_set_cpu_partial()
-	 */
-	if (!kmem_cache_has_cpu_partial(s))
-		nr_objects = 0;
-	else if (s->size >= PAGE_SIZE)
-		nr_objects = 6;
-	else if (s->size >= 1024)
-		nr_objects = 24;
-	else if (s->size >= 256)
-		nr_objects = 52;
-	else
-		nr_objects = 120;
-
-	slub_set_cpu_partial(s, nr_objects);
-#endif
-}
-
 static unsigned int calculate_sheaf_capacity(struct kmem_cache *s,
 					     struct kmem_cache_args *args)
 
@@ -8627,8 +8402,6 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
 	s->min_partial = min_t(unsigned long, MAX_PARTIAL, ilog2(s->size) / 2);
 	s->min_partial = max_t(unsigned long, MIN_PARTIAL, s->min_partial);
 
-	set_cpu_partial(s);
-
 	s->cpu_sheaves = alloc_percpu(struct slub_percpu_sheaves);
 	if (!s->cpu_sheaves) {
 		err = -ENOMEM;
@@ -8992,20 +8765,6 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 			total += x;
 			nodes[node] += x;
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-			slab = slub_percpu_partial_read_once(c);
-			if (slab) {
-				node = slab_nid(slab);
-				if (flags & SO_TOTAL)
-					WARN_ON_ONCE(1);
-				else if (flags & SO_OBJECTS)
-					WARN_ON_ONCE(1);
-				else
-					x = data_race(slab->slabs);
-				total += x;
-				nodes[node] += x;
-			}
-#endif
 		}
 	}
 
@@ -9140,12 +8899,7 @@ SLAB_ATTR(min_partial);
 
 static ssize_t cpu_partial_show(struct kmem_cache *s, char *buf)
 {
-	unsigned int nr_partial = 0;
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	nr_partial = s->cpu_partial;
-#endif
-
-	return sysfs_emit(buf, "%u\n", nr_partial);
+	return sysfs_emit(buf, "0\n");
 }
 
 static ssize_t cpu_partial_store(struct kmem_cache *s, const char *buf,
@@ -9157,11 +8911,9 @@ static ssize_t cpu_partial_store(struct kmem_cache *s, const char *buf,
 	err = kstrtouint(buf, 10, &objects);
 	if (err)
 		return err;
-	if (objects && !kmem_cache_has_cpu_partial(s))
+	if (objects)
 		return -EINVAL;
 
-	slub_set_cpu_partial(s, objects);
-	flush_all(s);
 	return length;
 }
 SLAB_ATTR(cpu_partial);
@@ -9200,42 +8952,7 @@ SLAB_ATTR_RO(objects_partial);
 
 static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf)
 {
-	int objects = 0;
-	int slabs = 0;
-	int cpu __maybe_unused;
-	int len = 0;
-
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	for_each_online_cpu(cpu) {
-		struct slab *slab;
-
-		slab = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu));
-
-		if (slab)
-			slabs += data_race(slab->slabs);
-	}
-#endif
-
-	/* Approximate half-full slabs, see slub_set_cpu_partial() */
-	objects = (slabs * oo_objects(s->oo)) / 2;
-	len += sysfs_emit_at(buf, len, "%d(%d)", objects, slabs);
-
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-	for_each_online_cpu(cpu) {
-		struct slab *slab;
-
-		slab = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu));
-		if (slab) {
-			slabs = data_race(slab->slabs);
-			objects = (slabs * oo_objects(s->oo)) / 2;
-			len += sysfs_emit_at(buf, len, " C%d=%d(%d)",
-					     cpu, objects, slabs);
-		}
-	}
-#endif
-	len += sysfs_emit_at(buf, len, "\n");
-
-	return len;
+	return sysfs_emit(buf, "0(0)\n");
 }
 SLAB_ATTR_RO(slabs_cpu_partial);
 
-- 
2.52.0