[PATCH v2] slab: fix kmalloc_nolock() context check for PREEMPT_RT

Swaraj Gaikwad posted 1 patch 3 weeks, 4 days ago
Posted by Swaraj Gaikwad 3 weeks, 4 days ago
On PREEMPT_RT kernels, local_lock becomes a sleeping lock. The current
check in kmalloc_nolock() only verifies we're not in NMI or hard IRQ
context, but misses the case where preemption is disabled.

When a BPF program runs from a tracepoint with preemption disabled
(preempt_count > 0), kmalloc_nolock() proceeds to call
local_lock_irqsave() which attempts to acquire a sleeping lock,
triggering:

  BUG: sleeping function called from invalid context
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6128
  preempt_count: 2, expected: 0

Fix this by checking !preemptible() on PREEMPT_RT, which directly
expresses the constraint that we cannot take a sleeping lock when
preemption is disabled. This encompasses the previous checks for NMI
and hard IRQ contexts while also catching cases where preemption is
disabled.

Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Reported-by: syzbot+b1546ad4a95331b2101e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b1546ad4a95331b2101e
Signed-off-by: Swaraj Gaikwad <swarajgaikwad1925@gmail.com>
---
Changes in v2:
- Simplified condition from (in_nmi() || in_hardirq() || preempt_count())
  to !preemptible() as suggested by Luis Claudio R. Goncalves and agreed
  by Vlastimil Babka
- Updated comment to reflect the more descriptive check

Tested by building with syz config and running the syzbot
reproducer - kernel no longer crashes.

 mm/slub.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 2acce22590f8..642f4744d5c6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5689,8 +5689,12 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 	if (unlikely(!size))
 		return ZERO_SIZE_PTR;

-	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
-		/* kmalloc_nolock() in PREEMPT_RT is not supported from irq */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
+		/*
+		 * kmalloc_nolock() in PREEMPT_RT is not supported from
+		 * non-preemptible context because local_lock becomes a
+		 * sleeping lock on RT.
+		 */
 		return NULL;
 retry:
 	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))

base-commit: 559e608c46553c107dbba19dae0854af7b219400
--
2.52.0
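
To illustrate the difference between the old and new checks, here is a minimal userspace sketch, not kernel code. The struct, the MODEL_* bit values, and every function name are hypothetical stand-ins. The only assumptions about the kernel are that NMI and hard IRQ context are tracked as bits in preempt_count and that preemptible() is true only when preempt_count is zero and interrupts are enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real layout is in include/linux/preempt.h. */
#define MODEL_HARDIRQ_BIT 0x00010000u
#define MODEL_NMI_BIT     0x00100000u

/* Hypothetical model of the state the context check looks at. */
struct ctx_model {
	bool rt;                    /* stands in for IS_ENABLED(CONFIG_PREEMPT_RT) */
	bool irqs_disabled;         /* stands in for irqs_disabled() */
	unsigned int preempt_count; /* stands in for preempt_count() */
};

bool model_in_nmi(const struct ctx_model *c)
{
	return (c->preempt_count & MODEL_NMI_BIT) != 0;
}

bool model_in_hardirq(const struct ctx_model *c)
{
	return (c->preempt_count & MODEL_HARDIRQ_BIT) != 0;
}

/* Assumed semantics: preemptible only with a zero count and IRQs enabled. */
bool model_preemptible(const struct ctx_model *c)
{
	return c->preempt_count == 0 && !c->irqs_disabled;
}

/* Condition before the patch: only NMI and hard IRQ are rejected on RT. */
bool old_check_rejects(const struct ctx_model *c)
{
	return c->rt && (model_in_nmi(c) || model_in_hardirq(c));
}

/* Condition after the patch: any non-preemptible context is rejected on RT. */
bool new_check_rejects(const struct ctx_model *c)
{
	return c->rt && !model_preemptible(c);
}

/* The reported state: preempt_count: 2, irqs_disabled(): 0, on an RT kernel. */
void model_example(void)
{
	struct ctx_model bpf = { .rt = true, .irqs_disabled = false,
				 .preempt_count = 2 };

	assert(!old_check_rejects(&bpf)); /* old check lets the caller through */
	assert(new_check_rejects(&bpf));  /* new check returns NULL early */
}
```

In this model, any context the old check rejects has a nonzero preempt_count, so the new check rejects it as well. That is the sense in which !preemptible() encompasses the previous checks.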
Re: [PATCH v2] slab: fix kmalloc_nolock() context check for PREEMPT_RT
Posted by Vlastimil Babka 3 weeks, 4 days ago
On 1/13/26 16:06, Swaraj Gaikwad wrote:
> On PREEMPT_RT kernels, local_lock becomes a sleeping lock. The current
> check in kmalloc_nolock() only verifies we're not in NMI or hard IRQ
> context, but misses the case where preemption is disabled.
> 
> When a BPF program runs from a tracepoint with preemption disabled
> (preempt_count > 0), kmalloc_nolock() proceeds to call
> local_lock_irqsave() which attempts to acquire a sleeping lock,
> triggering:
> 
>   BUG: sleeping function called from invalid context
>   in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6128
>   preempt_count: 2, expected: 0
> 
> Fix this by checking !preemptible() on PREEMPT_RT, which directly
> expresses the constraint that we cannot take a sleeping lock when
> preemption is disabled. This encompasses the previous checks for NMI
> and hard IRQ contexts while also catching cases where preemption is
> disabled.
> 
> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> Reported-by: syzbot+b1546ad4a95331b2101e@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b1546ad4a95331b2101e
> Signed-off-by: Swaraj Gaikwad <swarajgaikwad1925@gmail.com>

Added to slab/for-next-fixes, thanks!

> ---
> Changes in v2:
> - Simplified condition from (in_nmi() || in_hardirq() || preempt_count())
>   to !preemptible() as suggested by Luis Claudio R. Goncalves and agreed
>   by Vlastimil Babka
> - Updated comment to reflect the more descriptive check
> 
> Tested by building with syz config and running the syzbot
> reproducer - kernel no longer crashes.
> 
>  mm/slub.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 2acce22590f8..642f4744d5c6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5689,8 +5689,12 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
>  	if (unlikely(!size))
>  		return ZERO_SIZE_PTR;
> 
> -	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> -		/* kmalloc_nolock() in PREEMPT_RT is not supported from irq */
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> +		/*
> +		 * kmalloc_nolock() in PREEMPT_RT is not supported from
> +		 * non-preemptible context because local_lock becomes a
> +		 * sleeping lock on RT.
> +		 */
>  		return NULL;
>  retry:
>  	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> 
> base-commit: 559e608c46553c107dbba19dae0854af7b219400
> --
> 2.52.0
>
Re: [PATCH v2] slab: fix kmalloc_nolock() context check for PREEMPT_RT
Posted by Harry Yoo 3 weeks, 4 days ago
On Tue, Jan 13, 2026 at 08:36:39PM +0530, Swaraj Gaikwad wrote:
> On PREEMPT_RT kernels, local_lock becomes a sleeping lock. The current
> check in kmalloc_nolock() only verifies we're not in NMI or hard IRQ
> context, but misses the case where preemption is disabled.
> 
> When a BPF program runs from a tracepoint with preemption disabled
> (preempt_count > 0), kmalloc_nolock() proceeds to call
> local_lock_irqsave() which attempts to acquire a sleeping lock,
> triggering:
> 
>   BUG: sleeping function called from invalid context
>   in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6128
>   preempt_count: 2, expected: 0
> 
> Fix this by checking !preemptible() on PREEMPT_RT, which directly
> expresses the constraint that we cannot take a sleeping lock when
> preemption is disabled. This encompasses the previous checks for NMI
> and hard IRQ contexts while also catching cases where preemption is
> disabled.
> 
> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> Reported-by: syzbot+b1546ad4a95331b2101e@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b1546ad4a95331b2101e
> Signed-off-by: Swaraj Gaikwad <swarajgaikwad1925@gmail.com>
> ---

Acked-by: Harry Yoo <harry.yoo@oracle.com>

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH v2] slab: fix kmalloc_nolock() context check for PREEMPT_RT
Posted by Sebastian Andrzej Siewior 3 weeks, 4 days ago
On 2026-01-13 20:36:39 [+0530], Swaraj Gaikwad wrote:
> On PREEMPT_RT kernels, local_lock becomes a sleeping lock. The current
> check in kmalloc_nolock() only verifies we're not in NMI or hard IRQ
> context, but misses the case where preemption is disabled.

The reasoning was different back then.

> When a BPF program runs from a tracepoint with preemption disabled
> (preempt_count > 0), kmalloc_nolock() proceeds to call
> local_lock_irqsave() which attempts to acquire a sleeping lock,
> triggering:
> 
>   BUG: sleeping function called from invalid context
>   in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6128
>   preempt_count: 2, expected: 0
> 
> Fix this by checking !preemptible() on PREEMPT_RT, which directly
> expresses the constraint that we cannot take a sleeping lock when
> preemption is disabled. This encompasses the previous checks for NMI
> and hard IRQ contexts while also catching cases where preemption is
> disabled.
> 
> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> Reported-by: syzbot+b1546ad4a95331b2101e@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b1546ad4a95331b2101e
> Signed-off-by: Swaraj Gaikwad <swarajgaikwad1925@gmail.com>
> ---

Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

for now.

> Changes in v2:
> - Simplified condition from (in_nmi() || in_hardirq() || preempt_count())
>   to !preemptible() as suggested by Luis Claudio R. Goncalves and agreed
>   by Vlastimil Babka
> - Updated comment to reflect the more descriptive check
> 
> Tested by building with syz config and running the syzbot
> reproducer - kernel no longer crashes.
> 
>  mm/slub.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 2acce22590f8..642f4744d5c6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5689,8 +5689,12 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
>  	if (unlikely(!size))
>  		return ZERO_SIZE_PTR;
> 
> -	if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> -		/* kmalloc_nolock() in PREEMPT_RT is not supported from irq */
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> +		/*
> +		 * kmalloc_nolock() in PREEMPT_RT is not supported from
> +		 * non-preemptible context because local_lock becomes a
> +		 * sleeping lock on RT.

I would say that despite the _nolock() suffix a local_lock() is still
acquired. The !PREEMPT_RT does a trylock.

As I noticed this myself today while looking at other patches, was the
trylock removed on RT by accident, was it there only in an earlier
version which was never merged and will it ever come back so we can go
back to !nmi || !hardirq?

> +		 */
>  		return NULL;
>  retry:
>  	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> 

Sebastian
Re: [PATCH v2] slab: fix kmalloc_nolock() context check for PREEMPT_RT
Posted by Vlastimil Babka 3 weeks, 4 days ago
On 1/13/26 19:00, Sebastian Andrzej Siewior wrote:
> 
> I would say that despite the _nolock() suffix a local_lock() is still
> acquired. The !PREEMPT_RT does a trylock.
> 
> As I noticed this myself today while looking at other patches, was the
> trylock removed on RT by accident, was it there only in an earlier
> version which was never merged and will it ever come back so we can go
> back to !nmi || !hardirq?

IIRC there was no version that would always do a trylock on RT (or maybe
there was some early one but it ran into trouble quickly?). The problem was
converting the slub code to deal with situations where initially trylock
succeeds in the given context, but then it's dropped and later needed again,
and failing that later trylock would be too complex to unwind. So instead we
do the local_lock_is_locked() check upfront and then trust that all nested
local_lock_cpu_slab()'s can't fail. And unfortunately this doesn't play very
well with RT semantics.
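
A rough userspace sketch of the upfront-check pattern described above, assuming a single-threaded model in which "locked" simply means an interrupted holder on the same CPU. All model_* names and MODEL_REFILL are hypothetical and are not the mm/slub.c implementation. The sketch does not model RT sleeping-lock semantics at all; it only shows why checking once up front avoids having to unwind a failed trylock halfway through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_REFILL 4 /* hypothetical number of objects added by a refill */

/* Hypothetical stand-in for per-CPU slab state guarded by a local lock. */
struct model_cpu_slab {
	bool locked;
	int free_objects;
};

static bool model_is_locked(const struct model_cpu_slab *s)
{
	return s->locked;
}

static void model_lock(struct model_cpu_slab *s)
{
	/* The upfront check is what makes this acquisition unable to fail. */
	assert(!s->locked);
	s->locked = true;
}

static void model_unlock(struct model_cpu_slab *s)
{
	s->locked = false;
}

void *model_alloc_nolock(struct model_cpu_slab *s)
{
	static char object; /* stand-in for a real slab object */

	/* Upfront check: bail out before any state has been touched. */
	if (model_is_locked(s))
		return NULL;

	/* First acquisition: inspect the per-CPU free list. */
	model_lock(s);
	bool need_refill = (s->free_objects == 0);
	model_unlock(s);

	/* The lock is dropped here, e.g. while more objects are obtained. */

	/* Later re-acquisition: trusted not to fail, so nothing to unwind. */
	model_lock(s);
	if (need_refill)
		s->free_objects = MODEL_REFILL;
	s->free_objects--;
	model_unlock(s);

	return &object;
}
```

If the second acquisition were a trylock that could fail, the function would have to undo whatever happened between the two critical sections. In this model the early NULL return is the only failure path.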

> 
>> +		 */
>>  		return NULL;
>>  retry:
>>  	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
>> 
> 
> Sebastian
Re: [PATCH v2] slab: fix kmalloc_nolock() context check for PREEMPT_RT
Posted by Alexei Starovoitov 3 weeks, 4 days ago
On Tue, Jan 13, 2026 at 10:00 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2026-01-13 20:36:39 [+0530], Swaraj Gaikwad wrote:
> > On PREEMPT_RT kernels, local_lock becomes a sleeping lock. The current
> > check in kmalloc_nolock() only verifies we're not in NMI or hard IRQ
> > context, but misses the case where preemption is disabled.
>
> The reasoning was different back then.
>
> > When a BPF program runs from a tracepoint with preemption disabled
> > (preempt_count > 0), kmalloc_nolock() proceeds to call
> > local_lock_irqsave() which attempts to acquire a sleeping lock,
> > triggering:
> >
> >   BUG: sleeping function called from invalid context
> >   in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6128
> >   preempt_count: 2, expected: 0
> >
> > Fix this by checking !preemptible() on PREEMPT_RT, which directly
> > expresses the constraint that we cannot take a sleeping lock when
> > preemption is disabled. This encompasses the previous checks for NMI
> > and hard IRQ contexts while also catching cases where preemption is
> > disabled.
> >
> > Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> > Reported-by: syzbot+b1546ad4a95331b2101e@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=b1546ad4a95331b2101e
> > Signed-off-by: Swaraj Gaikwad <swarajgaikwad1925@gmail.com>
> > ---
>
> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
> for now.
>
> > Changes in v2:
> > - Simplified condition from (in_nmi() || in_hardirq() || preempt_count())
> >   to !preemptible() as suggested by Luis Claudio R. Goncalves and agreed
> >   by Vlastimil Babka
> > - Updated comment to reflect the more descriptive check
> >
> > Tested by building with syz config and running the syzbot
> > reproducer - kernel no longer crashes.
> >
> >  mm/slub.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 2acce22590f8..642f4744d5c6 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -5689,8 +5689,12 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
> >       if (unlikely(!size))
> >               return ZERO_SIZE_PTR;
> >
> > -     if (IS_ENABLED(CONFIG_PREEMPT_RT) && (in_nmi() || in_hardirq()))
> > -             /* kmalloc_nolock() in PREEMPT_RT is not supported from irq */
> > +     if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> > +             /*
> > +              * kmalloc_nolock() in PREEMPT_RT is not supported from
> > +              * non-preemptible context because local_lock becomes a
> > +              * sleeping lock on RT.
>
> I would say that despite the _nolock() suffix a local_lock() is still
> acquired. The !PREEMPT_RT does a trylock.
>
> As I noticed this myself today while looking at other patches, was the
> trylock removed on RT by accident, was it there only in an earlier
> version which was never merged and will it ever come back so we can go
> back to !nmi || !hardirq?

The root cause of this syzbot splat is preempt_disable() in
trace_virtio_transport_alloc_pkt() that is being fixed separately.
I guess this patch doesn't hurt, but I suspect that, with tracepoints
moving to srcu_fast, syzbot won't be able to find the
preempt_disable() + kmalloc_nolock() case.

Acked-by: Alexei Starovoitov <ast@kernel.org>

for now :)
until sheaves come.