From: chengming.zhou@linux.dev
To: cl@linux.com, penberg@kernel.org
Cc: rientjes@google.com, iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
 vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
 willy@infradead.org, pcc@google.com, tytso@mit.edu, maz@kernel.org,
 ruansy.fnst@fujitsu.com, vishal.moola@gmail.com, lrh2000@pku.edu.cn,
 hughd@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 chengming.zhou@linux.dev, Chengming Zhou <chengming.zhou@linux.dev>
Subject: [RFC PATCH v2 1/6] slub: Keep track of whether slub is on the per-node partial list
Date: Sat, 21 Oct 2023 14:43:12 +0000
Message-Id: <20231021144317.3400916-2-chengming.zhou@linux.dev>
In-Reply-To: <20231021144317.3400916-1-chengming.zhou@linux.dev>

From: Chengming Zhou <chengming.zhou@linux.dev>

We currently rely on the "frozen" bit to tell whether we may manipulate
slab->slab_list, and that will change in the following patches. So
introduce another way to keep track of whether a slab is on the per-node
partial list: reuse the PG_workingset bit for this purpose.
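
For illustration, the intended pairing looks roughly like this (a sketch
assembled from the hunks below, not an additional code path; it assumes
every caller holds n->list_lock, which is why the non-atomic
__folio_set/__folio_clear variants are sufficient):

	/* adding to the per-node partial list, under n->list_lock */
	list_add(&slab->slab_list, &n->partial);
	slab_set_node_partial(slab);

	/* removing from it, also under n->list_lock */
	list_del(&slab->slab_list);
	slab_clear_node_partial(slab);

	/* a later check, still under n->list_lock */
	if (slab_test_node_partial(slab))
		/* the slab is still on the per-node partial list */;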
Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 include/linux/page-flags.h |  2 ++
 mm/slab.h                  | 19 +++++++++++++++++++
 mm/slub.c                  |  3 +++
 3 files changed, 24 insertions(+)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index a88e64acebfe..e8b1be71d722 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -478,6 +478,8 @@ PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
 	TESTCLEARFLAG(Active, active, PF_HEAD)
 PAGEFLAG(Workingset, workingset, PF_HEAD)
 	TESTCLEARFLAG(Workingset, workingset, PF_HEAD)
+	__SETPAGEFLAG(Workingset, workingset, PF_HEAD)
+	__CLEARPAGEFLAG(Workingset, workingset, PF_HEAD)
 __PAGEFLAG(Slab, slab, PF_NO_TAIL)
 PAGEFLAG(Checked, checked, PF_NO_COMPOUND)	   /* Used by some filesystems */
 
diff --git a/mm/slab.h b/mm/slab.h
index 8cd3294fedf5..9cff64cae8de 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -193,6 +193,25 @@ static inline void __slab_clear_pfmemalloc(struct slab *slab)
 	__folio_clear_active(slab_folio(slab));
 }
 
+/*
+ * Slub reuse PG_workingset bit to keep track of whether it's on
+ * the per-node partial list.
+ */
+static inline bool slab_test_node_partial(const struct slab *slab)
+{
+	return folio_test_workingset((struct folio *)slab_folio(slab));
+}
+
+static inline void slab_set_node_partial(struct slab *slab)
+{
+	__folio_set_workingset(slab_folio(slab));
+}
+
+static inline void slab_clear_node_partial(struct slab *slab)
+{
+	__folio_clear_workingset(slab_folio(slab));
+}
+
 static inline void *slab_address(const struct slab *slab)
 {
 	return folio_address(slab_folio(slab));
diff --git a/mm/slub.c b/mm/slub.c
index 63d281dfacdb..3fad4edca34b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2127,6 +2127,7 @@ __add_partial(struct kmem_cache_node *n, struct slab *slab, int tail)
 		list_add_tail(&slab->slab_list, &n->partial);
 	else
 		list_add(&slab->slab_list, &n->partial);
+	slab_set_node_partial(slab);
 }
 
 static inline void add_partial(struct kmem_cache_node *n,
@@ -2141,6 +2142,7 @@ static inline void remove_partial(struct kmem_cache_node *n,
 {
 	lockdep_assert_held(&n->list_lock);
 	list_del(&slab->slab_list);
+	slab_clear_node_partial(slab);
 	n->nr_partial--;
 }
 
@@ -4831,6 +4833,7 @@ static int __kmem_cache_do_shrink(struct kmem_cache *s)
 
 		if (free == slab->objects) {
 			list_move(&slab->slab_list, &discard);
+			slab_clear_node_partial(slab);
 			n->nr_partial--;
 			dec_slabs_node(s, node, slab->objects);
 		} else if (free <= SHRINK_PROMOTE_MAX)
-- 
2.20.1
From: chengming.zhou@linux.dev
Subject: [RFC PATCH v2 2/6] slub: Prepare __slab_free() for unfrozen partial slab out of node partial list
Date: Sat, 21 Oct 2023 14:43:13 +0000
Message-Id: <20231021144317.3400916-3-chengming.zhou@linux.dev>
In-Reply-To: <20231021144317.3400916-1-chengming.zhou@linux.dev>

From: Chengming Zhou <chengming.zhou@linux.dev>

At the moment a partial slab is frozen when it is taken off the node
partial list, so __slab_free() can tell from "was_frozen" that the slab
is not on the node partial list and is in use by one kmem_cache_cpu.

We are about to change this and let partial slabs leave the node partial
list in an unfrozen state, so __slab_free() needs to use the new
slab_test_node_partial() introduced in the previous patch instead.

Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 3fad4edca34b..adeff8df85ec 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3610,6 +3610,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 	unsigned long counters;
 	struct kmem_cache_node *n = NULL;
 	unsigned long flags;
+	bool on_node_partial;
 
 	stat(s, FREE_SLOWPATH);
 
@@ -3657,6 +3658,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 				 */
 				spin_lock_irqsave(&n->list_lock, flags);
 
+				on_node_partial = slab_test_node_partial(slab);
 			}
 		}
 
@@ -3685,6 +3687,15 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 		return;
 	}
 
+	/*
+	 * This slab was not full and not on the per-node partial list either,
+	 * in which case we shouldn't manipulate its list, just early return.
+	 */
+	if (prior && !on_node_partial) {
+		spin_unlock_irqrestore(&n->list_lock, flags);
+		return;
+	}
+
 	if (unlikely(!new.inuse && n->nr_partial >= s->min_partial))
 		goto slab_empty;
 
-- 
2.20.1

From: chengming.zhou@linux.dev
Subject: [RFC PATCH v2 3/6] slub: Don't freeze slabs for cpu partial
Date: Sat, 21 Oct 2023 14:43:14 +0000
Message-Id: <20231021144317.3400916-4-chengming.zhou@linux.dev>
In-Reply-To: <20231021144317.3400916-1-chengming.zhou@linux.dev>

From: Chengming Zhou <chengming.zhou@linux.dev>

We currently freeze slabs when moving them out of the node partial list
onto the cpu partial list. This approach needs two cmpxchg_double
operations:

1. freeze the slab (acquire_slab()) under the node list_lock
2. get_freelist() when the slab is picked for use in ___slab_alloc()

Actually we don't need to freeze when moving slabs off the node partial
list: we can delay freezing until the slab's freelist is needed in
___slab_alloc(), saving one cmpxchg_double() (sketched below). There are
other benefits as well:

1. Moving slabs between the node partial list and the cpu partial list
   becomes simpler, since we don't need to freeze or unfreeze at all.

2. There is less node list_lock contention, since we only need to freeze
   one slab under the node list_lock. (In fact, slabs can first be moved
   off the node partial list without freezing any slab at all, so
   contention on a slab won't translate into node list_lock contention.)
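
Roughly, the get_partial_node() loop after this patch behaves as follows
(a simplified sketch of the hunk further down; stat() calls and the
cpu_partial_slabs limit are omitted):

	list_for_each_entry_safe(slab, slab2, &n->partial, slab_list) {
		if (!object) {
			/* only the slab handed to the current cpu is
			 * frozen, the one remaining cmpxchg_double */
			t = acquire_slab(s, n, slab, object == NULL);
			if (t) {
				*pc->slab = slab;
				object = t;
				continue;
			}
		}
		/* the rest are parked on the cpu partial list unfrozen,
		 * with no cmpxchg_double at all */
		remove_partial(n, slab);
		put_cpu_partial(s, slab, 0);
	}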
We can achieve this because no concurrent path manipulates the partial
slab list except the __slab_free() path, which is serialized now.

Note that this patch only changes the parts that move the partial slabs,
to keep the review easy; the other parts are fixed in the following
patches. Specifically, this patch changes three paths:

1. getting a partial slab from the node: get_partial_node()
2. putting a partial slab back to the node: __unfreeze_partials()
3. caching a partial slab on the cpu in __slab_free()

Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 63 +++++++++++++++++--------------------------------------
 1 file changed, 19 insertions(+), 44 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index adeff8df85ec..61ee82ea21b6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2277,7 +2277,9 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 	struct slab *slab, *slab2;
 	void *object = NULL;
 	unsigned long flags;
+#ifdef CONFIG_SLUB_CPU_PARTIAL
 	unsigned int partial_slabs = 0;
+#endif
 
 	/*
 	 * Racy check. If we mistakenly see no partial slabs then we
@@ -2303,20 +2305,22 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 			continue;
 		}
 
-		t = acquire_slab(s, n, slab, object == NULL);
-		if (!t)
-			break;
-
 		if (!object) {
-			*pc->slab = slab;
-			stat(s, ALLOC_FROM_PARTIAL);
-			object = t;
-		} else {
-			put_cpu_partial(s, slab, 0);
-			stat(s, CPU_PARTIAL_NODE);
-			partial_slabs++;
+			t = acquire_slab(s, n, slab, object == NULL);
+			if (t) {
+				*pc->slab = slab;
+				stat(s, ALLOC_FROM_PARTIAL);
+				object = t;
+				continue;
+			}
 		}
+
 #ifdef CONFIG_SLUB_CPU_PARTIAL
+		remove_partial(n, slab);
+		put_cpu_partial(s, slab, 0);
+		stat(s, CPU_PARTIAL_NODE);
+		partial_slabs++;
+
 		if (!kmem_cache_has_cpu_partial(s)
 			|| partial_slabs > s->cpu_partial_slabs / 2)
 			break;
@@ -2606,9 +2610,6 @@ static void __unfreeze_partials(struct kmem_cache *s, struct slab *partial_slab)
 	unsigned long flags = 0;
 
 	while (partial_slab) {
-		struct slab new;
-		struct slab old;
-
 		slab = partial_slab;
 		partial_slab = slab->next;
 
@@ -2621,23 +2622,7 @@ static void __unfreeze_partials(struct kmem_cache *s, struct slab *partial_slab)
 			spin_lock_irqsave(&n->list_lock, flags);
 		}
 
-		do {
-
-			old.freelist = slab->freelist;
-			old.counters = slab->counters;
-			VM_BUG_ON(!old.frozen);
-
-			new.counters = old.counters;
-			new.freelist = old.freelist;
-
-			new.frozen = 0;
-
-		} while (!__slab_update_freelist(s, slab,
-				old.freelist, old.counters,
-				new.freelist, new.counters,
-				"unfreezing slab"));
-
-		if (unlikely(!new.inuse && n->nr_partial >= s->min_partial)) {
+		if (unlikely(!slab->inuse && n->nr_partial >= s->min_partial)) {
 			slab->next = slab_to_discard;
 			slab_to_discard = slab;
 		} else {
@@ -3634,18 +3619,8 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 		was_frozen = new.frozen;
 		new.inuse -= cnt;
 		if ((!new.inuse || !prior) && !was_frozen) {
-
-			if (kmem_cache_has_cpu_partial(s) && !prior) {
-
-				/*
-				 * Slab was on no list before and will be
-				 * partially empty
-				 * We can defer the list move and instead
-				 * freeze it.
-				 */
-				new.frozen = 1;
-
-			} else { /* Needs to be taken off a list */
+			/* Needs to be taken off a list */
+			if (!kmem_cache_has_cpu_partial(s) || prior) {
 
 				n = get_node(s, slab_nid(slab));
 				/*
@@ -3675,7 +3650,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 			 * activity can be necessary.
 			 */
 			stat(s, FREE_FROZEN);
-		} else if (new.frozen) {
+		} else if (kmem_cache_has_cpu_partial(s) && !prior) {
 			/*
 			 * If we just froze the slab then put it onto the
 			 * per cpu partial list.
-- 
2.20.1

From: chengming.zhou@linux.dev
Subject: [RFC PATCH v2 4/6] slub: Simplify acquire_slab()
Date: Sat, 21 Oct 2023 14:43:15 +0000
Message-Id: <20231021144317.3400916-5-chengming.zhou@linux.dev>
In-Reply-To: <20231021144317.3400916-1-chengming.zhou@linux.dev>

From: Chengming Zhou <chengming.zhou@linux.dev>

Now that object == NULL is always true at the call site, simplify
acquire_slab().

Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 61ee82ea21b6..9f0b80fefc70 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2222,8 +2222,7 @@ static void *alloc_single_from_new_slab(struct kmem_cache *s,
  * Returns a list of objects or NULL if it fails.
  */
 static inline void *acquire_slab(struct kmem_cache *s,
-		struct kmem_cache_node *n, struct slab *slab,
-		int mode)
+		struct kmem_cache_node *n, struct slab *slab)
 {
 	void *freelist;
 	unsigned long counters;
@@ -2239,12 +2238,8 @@ static inline void *acquire_slab(struct kmem_cache *s,
 	freelist = slab->freelist;
 	counters = slab->counters;
 	new.counters = counters;
-	if (mode) {
-		new.inuse = slab->objects;
-		new.freelist = NULL;
-	} else {
-		new.freelist = freelist;
-	}
+	new.inuse = slab->objects;
+	new.freelist = NULL;
 
 	VM_BUG_ON(new.frozen);
 	new.frozen = 1;
@@ -2306,7 +2301,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
 		}
 
 		if (!object) {
-			t = acquire_slab(s, n, slab, object == NULL);
+			t = acquire_slab(s, n, slab);
 			if (t) {
 				*pc->slab = slab;
 				stat(s, ALLOC_FROM_PARTIAL);
-- 
2.20.1

From: chengming.zhou@linux.dev
Subject: [RFC PATCH v2 5/6] slub: Introduce get_cpu_partial()
Date: Sat, 21 Oct 2023 14:43:16 +0000
Message-Id: <20231021144317.3400916-6-chengming.zhou@linux.dev>
In-Reply-To: <20231021144317.3400916-1-chengming.zhou@linux.dev>

From: Chengming Zhou <chengming.zhou@linux.dev>

Since slabs on the cpu partial list are not frozen anymore, introduce
get_cpu_partial() to get a frozen slab with its freelist from the cpu
partial list.
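
On the allocation side, ___slab_alloc() can then simply do something like
the following (a sketch of the caller added below; the node and
pfmemalloc checks happen inside get_cpu_partial() itself):

	freelist = get_cpu_partial(s, c, &slab, node, gfpflags);
	if (freelist) {
		/* the slab was frozen inside get_cpu_partial() with a
		 * single cmpxchg that also took its whole freelist */
		goto retry_load_slab;
	}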
This is now much like getting a frozen slab with its freelist from the
node partial list.

Another change concerns get_partial(): it can now return no frozen slab
even though it has moved some unfrozen slabs onto the cpu partial list
(when acquire_slab() failed for every slab), so we need to check for this
rare case to avoid allocating a new slab unnecessarily.

Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 87 +++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 68 insertions(+), 19 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 9f0b80fefc70..7fae959c56eb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3055,6 +3055,68 @@ static inline void *get_freelist(struct kmem_cache *s, struct slab *slab)
 	return freelist;
 }
 
+#ifdef CONFIG_SLUB_CPU_PARTIAL
+
+static void *get_cpu_partial(struct kmem_cache *s, struct kmem_cache_cpu *c,
+			     struct slab **slabptr, int node, gfp_t gfpflags)
+{
+	unsigned long flags;
+	struct slab *slab;
+	struct slab new;
+	unsigned long counters;
+	void *freelist;
+
+	while (slub_percpu_partial(c)) {
+		local_lock_irqsave(&s->cpu_slab->lock, flags);
+		if (unlikely(!slub_percpu_partial(c))) {
+			local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+			/* we were preempted and partial list got empty */
+			return NULL;
+		}
+
+		slab = slub_percpu_partial(c);
+		slub_set_percpu_partial(c, slab);
+		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
+		stat(s, CPU_PARTIAL_ALLOC);
+
+		if (unlikely(!node_match(slab, node) ||
+			     !pfmemalloc_match(slab, gfpflags))) {
+			slab->next = NULL;
+			__unfreeze_partials(s, slab);
+			continue;
+		}
+
+		do {
+			freelist = slab->freelist;
+			counters = slab->counters;
+
+			new.counters = counters;
+			VM_BUG_ON(new.frozen);
+
+			new.inuse = slab->objects;
+			new.frozen = 1;
+		} while (!__slab_update_freelist(s, slab,
+					freelist, counters,
+					NULL, new.counters,
+					"get_cpu_partial"));
+
+		*slabptr = slab;
+		return freelist;
+	}
+
+	return NULL;
+}
+
+#else	/* CONFIG_SLUB_CPU_PARTIAL */
+
+static void *get_cpu_partial(struct kmem_cache *s, struct kmem_cache_cpu *c,
+			     struct slab **slabptr, int node, gfp_t gfpflags)
+{
+	return NULL;
+}
+
+#endif
+
 /*
  * Slow path. The lockless freelist is empty or we need to perform
  * debugging duties.
@@ -3097,7 +3159,6 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 			node = NUMA_NO_NODE;
 		goto new_slab;
 	}
-redo:
 
 	if (unlikely(!node_match(slab, node))) {
 		/*
@@ -3173,24 +3234,9 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 
 new_slab:
 
-	if (slub_percpu_partial(c)) {
-		local_lock_irqsave(&s->cpu_slab->lock, flags);
-		if (unlikely(c->slab)) {
-			local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-			goto reread_slab;
-		}
-		if (unlikely(!slub_percpu_partial(c))) {
-			local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-			/* we were preempted and partial list got empty */
-			goto new_objects;
-		}
-
-		slab = c->slab = slub_percpu_partial(c);
-		slub_set_percpu_partial(c, slab);
-		local_unlock_irqrestore(&s->cpu_slab->lock, flags);
-		stat(s, CPU_PARTIAL_ALLOC);
-		goto redo;
-	}
+	freelist = get_cpu_partial(s, c, &slab, node, gfpflags);
+	if (freelist)
+		goto retry_load_slab;
 
 new_objects:
 
@@ -3201,6 +3247,9 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	if (freelist)
 		goto check_new_slab;
 
+	if (slub_percpu_partial(c))
+		goto new_slab;
+
 	slub_put_cpu_ptr(s->cpu_slab);
 	slab = new_slab(s, gfpflags, node);
 	c = slub_get_cpu_ptr(s->cpu_slab);
-- 
2.20.1
From: chengming.zhou@linux.dev
Subject: [RFC PATCH v2 6/6] slub: Optimize deactivate_slab()
Date: Sat, 21 Oct 2023 14:43:17 +0000
Message-Id: <20231021144317.3400916-7-chengming.zhou@linux.dev>
In-Reply-To: <20231021144317.3400916-1-chengming.zhou@linux.dev>

From: Chengming Zhou <chengming.zhou@linux.dev>

Since we have introduced unfrozen slabs on the cpu partial list, we don't
need to synchronize the slab's frozen state under the node list_lock
anymore: the caller of deactivate_slab() and the caller of __slab_free()
no longer manipulate the slab list concurrently.

So we can take the node list_lock only in stage three, and only when we
actually need to manipulate the slab list in this path.

Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 70 ++++++++++++++++++++-----------------------------------
 1 file changed, 25 insertions(+), 45 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 7fae959c56eb..29a60bfbf9c5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2493,10 +2493,8 @@ static void init_kmem_cache_cpus(struct kmem_cache *s)
 static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
 			    void *freelist)
 {
-	enum slab_modes { M_NONE, M_PARTIAL, M_FREE, M_FULL_NOLIST };
 	struct kmem_cache_node *n = get_node(s, slab_nid(slab));
 	int free_delta = 0;
-	enum slab_modes mode = M_NONE;
 	void *nextfree, *freelist_iter, *freelist_tail;
 	int tail = DEACTIVATE_TO_HEAD;
 	unsigned long flags = 0;
@@ -2543,58 +2541,40 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
 	 * unfrozen and number of objects in the slab may have changed.
 	 * Then release lock and retry cmpxchg again.
 	 */
-redo:
-
-	old.freelist = READ_ONCE(slab->freelist);
-	old.counters = READ_ONCE(slab->counters);
-	VM_BUG_ON(!old.frozen);
-
-	/* Determine target state of the slab */
-	new.counters = old.counters;
-	if (freelist_tail) {
-		new.inuse -= free_delta;
-		set_freepointer(s, freelist_tail, old.freelist);
-		new.freelist = freelist;
-	} else
-		new.freelist = old.freelist;
+	do {
+		old.freelist = READ_ONCE(slab->freelist);
+		old.counters = READ_ONCE(slab->counters);
+		VM_BUG_ON(!old.frozen);
+
+		/* Determine target state of the slab */
+		new.counters = old.counters;
+		new.frozen = 0;
+		if (freelist_tail) {
+			new.inuse -= free_delta;
+			set_freepointer(s, freelist_tail, old.freelist);
+			new.freelist = freelist;
+		} else
+			new.freelist = old.freelist;
 
-	new.frozen = 0;
+	} while (!slab_update_freelist(s, slab,
+				old.freelist, old.counters,
+				new.freelist, new.counters,
+				"unfreezing slab"));
 
+	/*
+	 * Stage three: Manipulate the slab list based on the updated state.
+	 */
 	if (!new.inuse && n->nr_partial >= s->min_partial) {
-		mode = M_FREE;
+		stat(s, DEACTIVATE_EMPTY);
+		discard_slab(s, slab);
+		stat(s, FREE_SLAB);
 	} else if (new.freelist) {
-		mode = M_PARTIAL;
-		/*
-		 * Taking the spinlock removes the possibility that
-		 * acquire_slab() will see a slab that is frozen
-		 */
 		spin_lock_irqsave(&n->list_lock, flags);
-	} else {
-		mode = M_FULL_NOLIST;
-	}
-
-
-	if (!slab_update_freelist(s, slab,
-				old.freelist, old.counters,
-				new.freelist, new.counters,
-				"unfreezing slab")) {
-		if (mode == M_PARTIAL)
-			spin_unlock_irqrestore(&n->list_lock, flags);
-		goto redo;
-	}
-
-
-	if (mode == M_PARTIAL) {
 		add_partial(n, slab, tail);
 		spin_unlock_irqrestore(&n->list_lock, flags);
 		stat(s, tail);
-	} else if (mode == M_FREE) {
-		stat(s, DEACTIVATE_EMPTY);
-		discard_slab(s, slab);
-		stat(s, FREE_SLAB);
-	} else if (mode == M_FULL_NOLIST) {
+	} else
 		stat(s, DEACTIVATE_FULL);
-	}
 }
 
 #ifdef CONFIG_SLUB_CPU_PARTIAL
-- 
2.20.1