[PATCH RFC v2 04/20] slab: add sheaves to most caches
Posted by Vlastimil Babka 4 weeks ago
In the first step to replace cpu (partial) slabs with sheaves, enable
sheaves for almost all caches. Treat args->sheaf_capacity as a minimum,
and calculate the sheaf capacity with a formula that roughly follows
the one used for the number of objects in cpu partial slabs in
set_cpu_partial().

This should result in roughly the same contention on the barn spin lock
as there currently is on the node list_lock without sheaves, to keep
benchmarking results comparable. It can be further tuned later.
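
To illustrate the calculation (assuming, for this example only, a
32-byte struct slab_sheaf header on a 64-bit kernel): a cache with
512-byte objects starts from the base capacity of 26; struct_size_t()
then gives 32 + 26 * 8 = 240 bytes, kmalloc_size_roundup() returns 256,
and the final capacity becomes (256 - 32) / 8 = 28 objects per sheaf.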

Don't enable sheaves for bootstrap caches as that wouldn't work. In
order to recognize them by SLAB_NO_OBJ_EXT, make sure the flag exists
even for !CONFIG_SLAB_OBJ_EXT.

This limitation will be lifted for kmalloc caches after the necessary
bootstrapping changes.
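
A sketch of the intended usage of the explicit minimum (the cache name,
object size and the value 96 are made up for illustration; the prefill
call assumes the kmem_cache_prefill_sheaf() interface introduced with
sheaves):

	struct kmem_cache_args args = {
		/* expected maximum kmem_cache_prefill_sheaf() request */
		.sheaf_capacity = 96,
	};
	struct kmem_cache *c;
	struct slab_sheaf *sheaf;

	c = kmem_cache_create("example_objs", 128, &args, 0);
	if (!c)
		return -ENOMEM;

	/*
	 * The computed capacity can only grow above the requested 96,
	 * never shrink below it, so prefills of up to 96 objects don't
	 * fall back to low-performance oversize sheaves.
	 */
	sheaf = kmem_cache_prefill_sheaf(c, GFP_KERNEL, 96);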

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/slab.h |  6 ------
 mm/slub.c            | 49 +++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 2482992248dc..2682ee57ec90 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -57,9 +57,7 @@ enum _slab_flag_bits {
 #endif
 	_SLAB_OBJECT_POISON,
 	_SLAB_CMPXCHG_DOUBLE,
-#ifdef CONFIG_SLAB_OBJ_EXT
 	_SLAB_NO_OBJ_EXT,
-#endif
 	_SLAB_FLAGS_LAST_BIT
 };
 
@@ -238,11 +236,7 @@ enum _slab_flag_bits {
 #define SLAB_TEMPORARY		SLAB_RECLAIM_ACCOUNT	/* Objects are short-lived */
 
 /* Slab created using create_boot_cache */
-#ifdef CONFIG_SLAB_OBJ_EXT
 #define SLAB_NO_OBJ_EXT		__SLAB_FLAG_BIT(_SLAB_NO_OBJ_EXT)
-#else
-#define SLAB_NO_OBJ_EXT		__SLAB_FLAG_UNUSED
-#endif
 
 /*
  * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
diff --git a/mm/slub.c b/mm/slub.c
index 8ffeb3ab3228..6e05e3cc5c49 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7857,6 +7857,46 @@ static void set_cpu_partial(struct kmem_cache *s)
 #endif
 }
 
+static unsigned int calculate_sheaf_capacity(struct kmem_cache *s,
+					     struct kmem_cache_args *args)
+{
+	unsigned int capacity;
+	size_t size;
+
+	if (IS_ENABLED(CONFIG_SLUB_TINY) || s->flags & SLAB_DEBUG_FLAGS)
+		return 0;
+
+	/* bootstrap caches can't have sheaves for now */
+	if (s->flags & SLAB_NO_OBJ_EXT)
+		return 0;
+
+	/*
+	 * For now, use a formula roughly similar to the one for percpu
+	 * partial slabs (divided by two, as there are two percpu sheaves),
+	 * which should result in similar lock contention (barn or list_lock).
+	 */
+	if (s->size >= PAGE_SIZE)
+		capacity = 4;
+	else if (s->size >= 1024)
+		capacity = 12;
+	else if (s->size >= 256)
+		capacity = 26;
+	else
+		capacity = 60;
+
+	/* Increase capacity so the sheaf exactly fills a kmalloc size bucket */
+	size = struct_size_t(struct slab_sheaf, objects, capacity);
+	size = kmalloc_size_roundup(size);
+	capacity = (size - struct_size_t(struct slab_sheaf, objects, 0)) / sizeof(void *);
+
+	/*
+	 * Respect an explicit capacity request, typically motivated by the
+	 * expected maximum size for kmem_cache_prefill_sheaf(), to avoid
+	 * ending up with low-performance oversize sheaves.
+	 */
+	return max(capacity, args->sheaf_capacity);
+}
+
 /*
  * calculate_sizes() determines the order and the distribution of data within
  * a slab object.
@@ -7991,6 +8031,10 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
 	if (s->flags & SLAB_RECLAIM_ACCOUNT)
 		s->allocflags |= __GFP_RECLAIMABLE;
 
+	/* kmalloc caches need extra care to support sheaves */
+	if (!is_kmalloc_cache(s))
+		s->sheaf_capacity = calculate_sheaf_capacity(s, args);
+
 	/*
 	 * Determine the number of objects per slab
 	 */
@@ -8595,15 +8639,12 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
 
 	set_cpu_partial(s);
 
-	if (args->sheaf_capacity && !IS_ENABLED(CONFIG_SLUB_TINY)
-					&& !(s->flags & SLAB_DEBUG_FLAGS)) {
+	if (s->sheaf_capacity) {
 		s->cpu_sheaves = alloc_percpu(struct slub_percpu_sheaves);
 		if (!s->cpu_sheaves) {
 			err = -ENOMEM;
 			goto out;
 		}
-		// TODO: increase capacity to grow slab_sheaf up to next kmalloc size?
-		s->sheaf_capacity = args->sheaf_capacity;
 	}
 
 #ifdef CONFIG_NUMA

-- 
2.52.0
Re: [PATCH RFC v2 04/20] slab: add sheaves to most caches
Posted by Harry Yoo 3 weeks, 3 days ago
On Mon, Jan 12, 2026 at 04:16:58PM +0100, Vlastimil Babka wrote:
> [...]
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH RFC v2 04/20] slab: add sheaves to most caches
Posted by Suren Baghdasaryan 3 weeks, 3 days ago
On Mon, Jan 12, 2026 at 3:17 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> [...]
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

One nit but otherwise LGTM.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

> [...]
> +       /* kmalloc caches need extra care to support sheaves */
> +       if (!is_kmalloc_cache(s))

nit: All the checks for the cases when sheaves should not be used
(like SLAB_DEBUG_FLAGS and SLAB_NO_OBJ_EXT) are done inside
calculate_sheaf_capacity(). Only this is_kmalloc_cache() one is here.
It would be nice to have all of them in the same place but maybe you
have a reason for keeping it here?

> +               s->sheaf_capacity = calculate_sheaf_capacity(s, args);
Re: [PATCH RFC v2 04/20] slab: add sheaves to most caches
Posted by Vlastimil Babka 3 weeks, 3 days ago
On 1/16/26 06:45, Suren Baghdasaryan wrote:
> On Mon, Jan 12, 2026 at 3:17 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>> [...]
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> 
> One nit but otherwise LGTM.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>

Thanks.

>> [...]
>> +       /* kmalloc caches need extra care to support sheaves */
>> +       if (!is_kmalloc_cache(s))
> 
> nit: All the checks for the cases when sheaves should not be used
> (like SLAB_DEBUG_FLAGS and SLAB_NO_OBJ_EXT) are done inside
> calculate_sheaf_capacity(). Only this is_kmalloc_cache() one is here.
> It would be nice to have all of them in the same place but maybe you
> have a reason for keeping it here?

Yeah, in "slab: handle kmalloc sheaves bootstrap" we call
calculate_sheaf_capacity() from another place for kmalloc normal caches so
the check has to be outside.

>> +               s->sheaf_capacity = calculate_sheaf_capacity(s, args);
>> +
>>         /*
>>          * Determine the number of objects per slab
>>          */
>> @@ -8595,15 +8641,12 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
>>
>>         set_cpu_partial(s);
>>
>> -       if (args->sheaf_capacity && !IS_ENABLED(CONFIG_SLUB_TINY)
>> -                                       && !(s->flags & SLAB_DEBUG_FLAGS)) {
>> +       if (s->sheaf_capacity) {
>>                 s->cpu_sheaves = alloc_percpu(struct slub_percpu_sheaves);
>>                 if (!s->cpu_sheaves) {
>>                         err = -ENOMEM;
>>                         goto out;
>>                 }
>> -               // TODO: increase capacity to grow slab_sheaf up to next kmalloc size?
>> -               s->sheaf_capacity = args->sheaf_capacity;
>>         }
>>
>>  #ifdef CONFIG_NUMA
>>
>> --
>> 2.52.0
>>

Re: [PATCH RFC v2 04/20] slab: add sheaves to most caches
Posted by Suren Baghdasaryan 3 weeks, 3 days ago
On Fri, Jan 16, 2026 at 3:24 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> [...]
> >> +       /* kmalloc caches need extra care to support sheaves */
> >> +       if (!is_kmalloc_cache(s))
> >
> > nit: All the checks for the cases when sheaves should not be used
> > (like SLAB_DEBUG_FLAGS and SLAB_NO_OBJ_EXT) are done inside
> > calculate_sheaf_capacity(). Only this is_kmalloc_cache() one is here.
> > It would be nice to have all of them in the same place but maybe you
> > have a reason for keeping it here?
>
> Yeah, in "slab: handle kmalloc sheaves bootstrap" we call
> calculate_sheaf_capacity() from another place for kmalloc normal caches so
> the check has to be outside.

Ok, I suspected the answer would be in the later patches. Thanks!
