[PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path

hu.shengming@zte.com.cn posted 1 patch 4 days, 17 hours ago
There is a newer version of this series
mm/slub.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
[PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path
Posted by hu.shengming@zte.com.cn 4 days, 17 hours ago
From: Shengming Hu <hu.shengming@zte.com.cn>

_kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
initial allocation fails. The retry currently reuses `size` as the
bucket selector and overwrites it with s->object_size + 1.

That value is later passed as the original allocation size to
__slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
successful retry this makes KASAN/slub-debug observe the retry bucket
selector rather than the caller requested size, potentially widening the
valid kmalloc range and hiding overflows.

Keep a separate `bucket_size` for choosing the retry cache and preserve
`size`.

Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
---
 mm/slub.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 67abbbf68fc1..6a2b3ade3611 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
 void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
 {
 	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
+	size_t bucket_size = size;
 	struct kmem_cache *s;
 	bool can_retry = true;
 	void *ret;
@@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
 		return NULL;

 retry:
-	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
+	if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
 		return NULL;
-	s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
+	s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));

 	if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
 		/*
@@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
 	 */
 	if (!ret && can_retry) {
 		/* pick the next kmalloc bucket */
-		size = s->object_size + 1;
+		bucket_size = s->object_size + 1;
 		/*
 		 * Another alternative is to
 		 * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
-- 
2.25.1
Re: [PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path
Posted by Harry Yoo 4 days ago

On 6/3/26 10:10 PM, hu.shengming@zte.com.cn wrote:
> From: Shengming Hu <hu.shengming@zte.com.cn>
> 
> _kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
> initial allocation fails. The retry currently reuses `size` as the
> bucket selector and overwrites it with s->object_size + 1.
> 
> That value is later passed as the original allocation size to
> __slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
> successful retry this makes KASAN/slub-debug observe the retry bucket
> selector rather than the caller requested size, potentially widening the
> valid kmalloc range and hiding overflows.

Good catch!

> Keep a separate `bucket_size` for choosing the retry cache and preserve
> `size`.
> 
> Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
> Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
> ---

I want to note that this conflicts with Vlastimil's work-in-progress
feature [1] where it separately stores orig_size in struct
slab_alloc_context which naturally solves the problem.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags

No strong opinion on whether to queue this patch and then rebase
Vlastimil's work on top, or wait for Vlastimil's work to solve the
problem instead...

>  mm/slub.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 67abbbf68fc1..6a2b3ade3611 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
>  void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
>  {
>  	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
> +	size_t bucket_size = size;
>  	struct kmem_cache *s;
>  	bool can_retry = true;
>  	void *ret;

But in either way, I think it's more straightforward to introduce
orig_size as a variable to keep the original size and pass it to
__slab_alloc_node().

> @@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
>  		return NULL;
> 
>  retry:
> -	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> +	if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
>  		return NULL;
> -	s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> +	s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> 
>  	if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
>  		/*
> @@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
>  	 */
>  	if (!ret && can_retry) {
>  		/* pick the next kmalloc bucket */
> -		size = s->object_size + 1;
> +		bucket_size = s->object_size + 1;
>  		/*
>  		 * Another alternative is to
>  		 * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;

-- 
Cheers,
Harry / Hyeonggon

Re: [PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path
Posted by hu.shengming@zte.com.cn 3 days, 20 hours ago
Harry wrote:
> On 6/3/26 10:10 PM, hu.shengming@zte.com.cn wrote:
> > From: Shengming Hu <hu.shengming@zte.com.cn>
> > 
> > _kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
> > initial allocation fails. The retry currently reuses `size` as the
> > bucket selector and overwrites it with s->object_size + 1.
> > 
> > That value is later passed as the original allocation size to
> > __slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
> > successful retry this makes KASAN/slub-debug observe the retry bucket
> > selector rather than the caller requested size, potentially widening the
> > valid kmalloc range and hiding overflows.
> 
> Good catch!
> 

Hi Harry,

> > Keep a separate `bucket_size` for choosing the retry cache and preserve
> > `size`.
> > 
> > Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
> > Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
> > ---
> 
> I want to note that this conflicts with Vlastimil's work-in-progress
> feature [1] where it separately stores orig_size in struct
> slab_alloc_context which naturally solves the problem.
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags
> 
> No strong opinion on whether to queue this patch and then rebase
> Vlastimil's work on top, or wait for Vlastimil's work to solve the
> problem instead...
> 

Thanks for the review and for pointing out Vlastimil's work-in-progress
series.

> >  mm/slub.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 67abbbf68fc1..6a2b3ade3611 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
> >  void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
> >  {
> >      gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
> > +    size_t bucket_size = size;
> >      struct kmem_cache *s;
> >      bool can_retry = true;
> >      void *ret;
> 
> But in either way, I think it's more straightforward to introduce
> orig_size as a variable to keep the original size and pass it to
> __slab_alloc_node().
> 

I agree that introducing an explicit orig_size is more straightforward
than adding bucket_size.

I'll update the patch to keep:

	size_t orig_size = size;

and use orig_size for the allocation hooks/KASAN paths, while allowing
size to continue being advanced for the retry cache selection.

> > @@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
> >          return NULL;
> > 
> >  retry:
> > -    if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> > +    if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
> >          return NULL;
> > -    s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> > +    s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> > 
> >      if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
> >          /*
> > @@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
> >       */
> >      if (!ret && can_retry) {
> >          /* pick the next kmalloc bucket */
> > -        size = s->object_size + 1;
> > +        bucket_size = s->object_size + 1;
> >          /*
> >           * Another alternative is to
> >           * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
> 
> -- 
> Cheers,
> Harry / Hyeonggon

--
With Best Regards,
Shengming
Re: [PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path
Posted by Vlastimil Babka (SUSE) 3 days, 22 hours ago
On 6/4/26 07:46, Harry Yoo wrote:
> 
> 
> On 6/3/26 10:10 PM, hu.shengming@zte.com.cn wrote:
>> From: Shengming Hu <hu.shengming@zte.com.cn>
>> 
>> _kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
>> initial allocation fails. The retry currently reuses `size` as the
>> bucket selector and overwrites it with s->object_size + 1.
>> 
>> That value is later passed as the original allocation size to
>> __slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
>> successful retry this makes KASAN/slub-debug observe the retry bucket
>> selector rather than the caller requested size, potentially widening the
>> valid kmalloc range and hiding overflows.
> 
> Good catch!
> 
>> Keep a separate `bucket_size` for choosing the retry cache and preserve
>> `size`.
>> 
>> Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
>> Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
>> ---
> 
> I want to note that this conflicts with Vlastimil's work-in-progress
> feature [1] where it separately stores orig_size in struct
> slab_alloc_context which naturally solves the problem.

Thanks, didn't realize it was solving this problem as a side-effect, hehe.

> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags
> 
> No strong opinion on whether to queue this patch and then rebase
> Vlastimil's work on top, or wait for Vlastimil's work to solve the
> problem instead...

It's better to queue fixes separately first, in case someone wants to
backport them, even though stable shouldn't be really necessary here.
This fix could still go to 7.2 but my series only to 7.3.

>>  mm/slub.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 67abbbf68fc1..6a2b3ade3611 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
>>  void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
>>  {
>>  	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
>> +	size_t bucket_size = size;
>>  	struct kmem_cache *s;
>>  	bool can_retry = true;
>>  	void *ret;
> 
> But in either way, I think it's more straightforward to introduce
> orig_size as a variable to keep the original size and pass it to
> __slab_alloc_node().

Agree. Because above, we wouldn't initialize bucket_size to a real bucket
size, but a <=bucket size so it's misleading.

>> @@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
>>  		return NULL;
>> 
>>  retry:
>> -	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
>> +	if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
>>  		return NULL;
>> -	s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
>> +	s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
>> 
>>  	if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
>>  		/*
>> @@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
>>  	 */
>>  	if (!ret && can_retry) {
>>  		/* pick the next kmalloc bucket */
>> -		size = s->object_size + 1;
>> +		bucket_size = s->object_size + 1;
>>  		/*
>>  		 * Another alternative is to
>>  		 * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
>
Re: [PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path
Posted by hu.shengming@zte.com.cn 3 days, 20 hours ago
VlastimilBabka(SUSE)<vbabka@kernel.org> wrote:
> On 6/4/26 07:46, Harry Yoo wrote:
> > 
> > 
> > On 6/3/26 10:10 PM, hu.shengming@zte.com.cn wrote:
> >> From: Shengming Hu <hu.shengming@zte.com.cn>
> >> 
> >> _kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
> >> initial allocation fails. The retry currently reuses `size` as the
> >> bucket selector and overwrites it with s->object_size + 1.
> >> 
> >> That value is later passed as the original allocation size to
> >> __slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
> >> successful retry this makes KASAN/slub-debug observe the retry bucket
> >> selector rather than the caller requested size, potentially widening the
> >> valid kmalloc range and hiding overflows.
> > 
> > Good catch!
> > 
> >> Keep a separate `bucket_size` for choosing the retry cache and preserve
> >> `size`.
> >> 
> >> Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
> >> Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn>
> >> ---
> > 
> > I want to note that this conflicts with Vlastimil's work-in-progress
> > feature [1] where it separately stores orig_size in struct
> > slab_alloc_context which naturally solves the problem.
> 
> Thanks, didn't realize it was solving this problem as a side-effect, hehe.
> 
> > [1]
> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags
> > 
> > No strong opinion on whether to queue this patch and then rebase
> > Vlastimil's work on top, or wait for Vlastimil's work to solve the
> > problem instead...
> 
> It's better to queue fixes separately first, in case someone wants to
> backport them, even though stable shouldn't be really necessary here.
> This fix could still go to 7.2 but my series only to 7.3.
> 

Hi Vlastimil,

I'll send a v2, it should be easy to rebase your
slab_alloc_context work on top.

> >>  mm/slub.c | 7 ++++---
> >>  1 file changed, 4 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index 67abbbf68fc1..6a2b3ade3611 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> >> @@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
> >>  void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
> >>  {
> >>      gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
> >> +    size_t bucket_size = size;
> >>      struct kmem_cache *s;
> >>      bool can_retry = true;
> >>      void *ret;
> > 
> > But in either way, I think it's more straightforward to introduce
> > orig_size as a variable to keep the original size and pass it to
> > __slab_alloc_node().
> 
> Agree. Because above, we wouldn't initialize bucket_size to a real bucket
> size, but a <=bucket size so it's misleading.
> 

Agreed, it makes sense.

> >> @@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
> >>          return NULL;
> >> 
> >>  retry:
> >> -    if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> >> +    if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
> >>          return NULL;
> >> -    s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> >> +    s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> >> 
> >>      if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
> >>          /*
> >> @@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
> >>       */
> >>      if (!ret && can_retry) {
> >>          /* pick the next kmalloc bucket */
> >> -        size = s->object_size + 1;
> >> +        bucket_size = s->object_size + 1;
> >>          /*
> >>           * Another alternative is to
> >>           * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
> >