[PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled

Stefan Klug posted 4 patches 1 month ago
There is a newer version of this series
[PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Stefan Klug 1 month ago
On kernels with PREEMPT_RT enabled, a "BUG: scheduling while atomic"
kernel oops occurs inside dw100_irq_handler -> vb2_buffer_done. This is
because vb2_buffer_done takes a spinlock which is not allowed within
interrupt context on PREEMPT_RT.

Fix that by making the irq handler threaded. The threaded interrupt
handling might cause the interrupt line to be disabled a little longer
than before. As the line is not shared, this has no negative side
effects.

Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com>
---
 drivers/media/platform/nxp/dw100/dw100.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/nxp/dw100/dw100.c b/drivers/media/platform/nxp/dw100/dw100.c
index 8a421059a1c9b55f514a29d3c2c5a6ffb76e0a64..4f5ef70e5f4a052fb5f208e35f8785f9d30dc54e 100644
--- a/drivers/media/platform/nxp/dw100/dw100.c
+++ b/drivers/media/platform/nxp/dw100/dw100.c
@@ -1600,8 +1600,9 @@ static int dw100_probe(struct platform_device *pdev)
 
 	pm_runtime_put_sync(&pdev->dev);
 
-	ret = devm_request_irq(&pdev->dev, irq, dw100_irq_handler, IRQF_ONESHOT,
-			       dev_name(&pdev->dev), dw_dev);
+	ret = devm_request_threaded_irq(&pdev->dev, irq, NULL,
+					dw100_irq_handler, IRQF_ONESHOT,
+					dev_name(&pdev->dev), dw_dev);
 	if (ret < 0) {
 		dev_err(&pdev->dev, "Failed to request irq: %d\n", ret);
 		goto err_pm;

-- 
2.51.0
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Nicolas Dufresne 1 month ago
Hi,

Le lundi 05 janvier 2026 à 12:35 +0100, Stefan Klug a écrit :
> On kernels with PREEMPT_RT enabled, a "BUG: scheduling while atomic"
> kernel oops occurs inside dw100_irq_handler -> vb2_buffer_done. This is
> because vb2_buffer_done takes a spinlock which is not allowed within
> interrupt context on PREEMPT_RT.
> 
> Fix that by making the irq handler threaded. The threaded interrupt
> handling might cause the interrupt line to be disabled a little longer
> than before. As the line is not shared, this has no negative side
> effects.

That's interesting, do you plan to update more drivers ? There is a lot of m2m
using hard IRQ to minimize the idle time (save a context switch), but RT support
might be more worth then that.

> 
> Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com>

Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

> ---
>  drivers/media/platform/nxp/dw100/dw100.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/media/platform/nxp/dw100/dw100.c
> b/drivers/media/platform/nxp/dw100/dw100.c
> index
> 8a421059a1c9b55f514a29d3c2c5a6ffb76e0a64..4f5ef70e5f4a052fb5f208e35f8785f9d30d
> c54e 100644
> --- a/drivers/media/platform/nxp/dw100/dw100.c
> +++ b/drivers/media/platform/nxp/dw100/dw100.c
> @@ -1600,8 +1600,9 @@ static int dw100_probe(struct platform_device *pdev)
>  
>  	pm_runtime_put_sync(&pdev->dev);
>  
> -	ret = devm_request_irq(&pdev->dev, irq, dw100_irq_handler,
> IRQF_ONESHOT,
> -			       dev_name(&pdev->dev), dw_dev);
> +	ret = devm_request_threaded_irq(&pdev->dev, irq, NULL,
> +					dw100_irq_handler, IRQF_ONESHOT,
> +					dev_name(&pdev->dev), dw_dev);
>  	if (ret < 0) {
>  		dev_err(&pdev->dev, "Failed to request irq: %d\n", ret);
>  		goto err_pm;
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Laurent Pinchart 1 month ago
On Mon, Jan 05, 2026 at 02:02:38PM -0500, Nicolas Dufresne wrote:
> Le lundi 05 janvier 2026 à 12:35 +0100, Stefan Klug a écrit :
> > On kernels with PREEMPT_RT enabled, a "BUG: scheduling while atomic"
> > kernel oops occurs inside dw100_irq_handler -> vb2_buffer_done. This is
> > because vb2_buffer_done takes a spinlock which is not allowed within
> > interrupt context on PREEMPT_RT.
> > 
> > Fix that by making the irq handler threaded. The threaded interrupt
> > handling might cause the interrupt line to be disabled a little longer
> > than before. As the line is not shared, this has no negative side
> > effects.
> 
> That's interesting, do you plan to update more drivers ? There is a lot of m2m
> using hard IRQ to minimize the idle time (save a context switch), but RT support
> might be more worth then that.

This is a part of PREEMPT_RT that puzzles me. By turning regular
spinlocks into mutexes, RT seems to break drivers that use those
spinlocks in hard IRQ handlers. That's a very large number of drivers
given how widespread regular spinlock usage is. Do drivers need to be
manually converted to either raw spinlocks or threaded IRQ handlers ?
What about non-RT kernels, how can a driver avoid the thread scheduling
penalty in those cases, do they need to manually select between
request_irq() and request_threaded_irq() based on if RT is enabled ?
This puzzles me, it feels like I must be missing something.

> > Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com>
> 
> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> 
> > ---
> >  drivers/media/platform/nxp/dw100/dw100.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/media/platform/nxp/dw100/dw100.c
> > b/drivers/media/platform/nxp/dw100/dw100.c
> > index
> > 8a421059a1c9b55f514a29d3c2c5a6ffb76e0a64..4f5ef70e5f4a052fb5f208e35f8785f9d30d
> > c54e 100644
> > --- a/drivers/media/platform/nxp/dw100/dw100.c
> > +++ b/drivers/media/platform/nxp/dw100/dw100.c
> > @@ -1600,8 +1600,9 @@ static int dw100_probe(struct platform_device *pdev)
> >  
> >  	pm_runtime_put_sync(&pdev->dev);
> >  
> > -	ret = devm_request_irq(&pdev->dev, irq, dw100_irq_handler, IRQF_ONESHOT,
> > -			       dev_name(&pdev->dev), dw_dev);
> > +	ret = devm_request_threaded_irq(&pdev->dev, irq, NULL,
> > +					dw100_irq_handler, IRQF_ONESHOT,
> > +					dev_name(&pdev->dev), dw_dev);
> >  	if (ret < 0) {
> >  		dev_err(&pdev->dev, "Failed to request irq: %d\n", ret);
> >  		goto err_pm;

-- 
Regards,

Laurent Pinchart
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Steven Rostedt 1 month ago
On Tue, 6 Jan 2026 01:59:21 +0200
Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:

> > That's interesting, do you plan to update more drivers ? There is a lot of m2m
> > using hard IRQ to minimize the idle time (save a context switch), but RT support
> > might be more worth then that.  
> 
> This is a part of PREEMPT_RT that puzzles me. By turning regular
> spinlocks into mutexes, RT seems to break drivers that use those
> spinlocks in hard IRQ handlers. That's a very large number of drivers
> given how widespread regular spinlock usage is. Do drivers need to be
> manually converted to either raw spinlocks or threaded IRQ handlers ?

No. Pretty much all interrupts are converted into threaded interrupt
handlers unless IRQF_NO_THREAD, IRQF_PERCPU, or IRQF_ONESHOT are specified.

The interrupt line is disabled until the thread handler is called.


> What about non-RT kernels, how can a driver avoid the thread scheduling
> penalty in those cases, do they need to manually select between
> request_irq() and request_threaded_irq() based on if RT is enabled ?
> This puzzles me, it feels like I must be missing something.

The issue here is that the interrupt handler specifies ONESHOT which causes
the handler to be executed in hard interrupt context.

-- Steve
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Laurent Pinchart 1 month ago
On Mon, Jan 05, 2026 at 07:39:33PM -0500, Steven Rostedt wrote:
> On Tue, 6 Jan 2026 01:59:21 +0200 Laurent Pinchart wrote:
> 
> > > That's interesting, do you plan to update more drivers ? There is a lot of m2m
> > > using hard IRQ to minimize the idle time (save a context switch), but RT support
> > > might be more worth then that.  
> > 
> > This is a part of PREEMPT_RT that puzzles me. By turning regular
> > spinlocks into mutexes, RT seems to break drivers that use those
> > spinlocks in hard IRQ handlers. That's a very large number of drivers
> > given how widespread regular spinlock usage is. Do drivers need to be
> > manually converted to either raw spinlocks or threaded IRQ handlers ?
> 
> No. Pretty much all interrupts are converted into threaded interrupt
> handlers unless IRQF_NO_THREAD, IRQF_PERCPU, or IRQF_ONESHOT are specified.
> 
> The interrupt line is disabled until the thread handler is called.
> 
> > What about non-RT kernels, how can a driver avoid the thread scheduling
> > penalty in those cases, do they need to manually select between
> > request_irq() and request_threaded_irq() based on if RT is enabled ?
> > This puzzles me, it feels like I must be missing something.
> 
> The issue here is that the interrupt handler specifies ONESHOT which causes
> the handler to be executed in hard interrupt context.

Gotcha.

Stefan, please explain in the commit message why the ONESHOT flag is
set by the driver.

-- 
Regards,

Laurent Pinchart
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Stefan Klug 1 month ago
Quoting Laurent Pinchart (2026-01-06 01:49:28)
> On Mon, Jan 05, 2026 at 07:39:33PM -0500, Steven Rostedt wrote:
> > On Tue, 6 Jan 2026 01:59:21 +0200 Laurent Pinchart wrote:
> > 
> > > > That's interesting, do you plan to update more drivers ? There is a lot of m2m
> > > > using hard IRQ to minimize the idle time (save a context switch), but RT support
> > > > might be more worth then that.  
> > > 
> > > This is a part of PREEMPT_RT that puzzles me. By turning regular
> > > spinlocks into mutexes, RT seems to break drivers that use those
> > > spinlocks in hard IRQ handlers. That's a very large number of drivers
> > > given how widespread regular spinlock usage is. Do drivers need to be
> > > manually converted to either raw spinlocks or threaded IRQ handlers ?
> > 
> > No. Pretty much all interrupts are converted into threaded interrupt
> > handlers unless IRQF_NO_THREAD, IRQF_PERCPU, or IRQF_ONESHOT are specified.
> > 
> > The interrupt line is disabled until the thread handler is called.
> > 
> > > What about non-RT kernels, how can a driver avoid the thread scheduling
> > > penalty in those cases, do they need to manually select between
> > > request_irq() and request_threaded_irq() based on if RT is enabled ?
> > > This puzzles me, it feels like I must be missing something.
> > 
> > The issue here is that the interrupt handler specifies ONESHOT which causes
> > the handler to be executed in hard interrupt context.
> 
> Gotcha.
> 
> Stefan, please explain in the commit message why the ONESHOT flag is
> set by the driver.

Hah, if I knew that :-).

The pieces I have are:
In the DT the interrupt line is marked as IRQ_TYPE_LEVEL_HIGH. I don't
know why and couldn't find a reference to that in the reference manual.
Assuming it is a level interrupt, then it makes sense to treat it as ONESHOT,
otherwise it would fire again immediately after handling the hard
interrupt...but it was a hard interrupt in first place - huh.

I just realize that I still miss a bit of the puzzle:
ONESHOT is doumented as:

"Interrupt is not reenabled after the hardirq handler finished. Used by
threaded interrupts which need to keep the irq line disabled until the
threaded handler has been run."

That makes perfect sense. So ONESHOT disables the irq line until the
thread_fn has completed (if it was set). Now on preempt_rt inside
irq_setup_forced_threading() we don't force threading if ONESHOT is
requested. Why is that?

So I'm left with two questions:
- Why aren't ONESHOT irq handlers forced to threaded on preempt_rt?
- Why was ONESHOT requested in first place as to my current knowledge it
  really only makes sense if a thread_fn is defined.

Did I just answer my own question? ONESHOT only makes sense if there is
a thread_fn and it is assumed that the hard handler is necessary. So
preempt_rt doesn't try to change that?

That would mean the ONESHOT in the dw100 was not necessary in first
place but didn't do any harm until preempt_rt was enabled... And if
ONSHOT is *not* set preempt_rt would automatically force the irq handler
to be threaded and set the ONESHOT flag in irq_setup_forced_threading().

So everything would be fine except that we'd still hit the timeout issue
from patch 4/4.

So if I got that right, the dw100 driver is in the unfortunate
situation, that the irq handler consists of two parts where the first
part *must* run in hard interrupt context and the second part *should* run
in hard interrupt context but it is fine if it becomes threaded due to
preempt_rt. As we can't model that, the best we can do is to always run
the second part threaded...

So patch 4/4 seems correct until we get new information about the
hardware.

Any thoughts?

Best regards,
Stefan

> 
> -- 
> Regards,
> 
> Laurent Pinchart
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Sebastian Andrzej Siewior 3 weeks, 4 days ago
On 2026-01-06 18:11:27 [+0100], Stefan Klug wrote:
> Hah, if I knew that :-).
> 
> The pieces I have are:
> In the DT the interrupt line is marked as IRQ_TYPE_LEVEL_HIGH. I don't
> know why and couldn't find a reference to that in the reference manual.

It is either a LEVEL interrupt or just marked as such. But it seems to
behave as such.

> Assuming it is a level interrupt, then it makes sense to treat it as ONESHOT,
> otherwise it would fire again immediately after handling the hard
> interrupt...but it was a hard interrupt in first place - huh.

So setting it ONESHOT while it is non-threaded does not seem to make
sense, correct.

> I just realize that I still miss a bit of the puzzle:
> ONESHOT is doumented as:
> 
> "Interrupt is not reenabled after the hardirq handler finished. Used by
> threaded interrupts which need to keep the irq line disabled until the
> threaded handler has been run."
> 
> That makes perfect sense. So ONESHOT disables the irq line until the
> thread_fn has completed (if it was set). Now on preempt_rt inside
> irq_setup_forced_threading() we don't force threading if ONESHOT is
> requested. Why is that?

Because ONESHOT is usually used where there is no primary handler/ the
primary handler does just a wake of thread.

> So I'm left with two questions:
> - Why aren't ONESHOT irq handlers forced to threaded on preempt_rt?

See above. Also PREEMPT_RT just enforces the kernel command line
threadirqs

> - Why was ONESHOT requested in first place as to my current knowledge it
>   really only makes sense if a thread_fn is defined.

I would say it was a mistake and nobody noticed it. There is no visible
difference if there is just the primary handler and the system does not
use threadirqs (or PREEMPT_RT which enforces it).

> Did I just answer my own question? ONESHOT only makes sense if there is
> a thread_fn and it is assumed that the hard handler is necessary. So
> preempt_rt doesn't try to change that?

Yes. ONESHOT is used if the interrupt source within the IRQ chip has to
be masked until after the thread completed. So setting ONESHOT without a
threaded handler is dubious.

> That would mean the ONESHOT in the dw100 was not necessary in first
> place but didn't do any harm until preempt_rt was enabled... And if
> ONSHOT is *not* set preempt_rt would automatically force the irq handler
> to be threaded and set the ONESHOT flag in irq_setup_forced_threading().

correct.

> So everything would be fine except that we'd still hit the timeout issue
> from patch 4/4.
> 
> So if I got that right, the dw100 driver is in the unfortunate
> situation, that the irq handler consists of two parts where the first
> part *must* run in hard interrupt context and the second part *should* run
> in hard interrupt context but it is fine if it becomes threaded due to
> preempt_rt. As we can't model that, the best we can do is to always run
> the second part threaded...

So happens if you avoid the IRQF_ONESHOT? Do you still get these
timeout errors?

> So patch 4/4 seems correct until we get new information about the
> hardware.
> 
> Any thoughts?
> 
> Best regards,
> Stefan

Sebastian
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Stefan Klug 3 weeks, 2 days ago
Hi Sebastian,

Thanks for your support.

Quoting Sebastian Andrzej Siewior (2026-01-12 12:43:13)
> On 2026-01-06 18:11:27 [+0100], Stefan Klug wrote:
> > Hah, if I knew that :-).
> > 
> > The pieces I have are:
> > In the DT the interrupt line is marked as IRQ_TYPE_LEVEL_HIGH. I don't
> > know why and couldn't find a reference to that in the reference manual.
> 
> It is either a LEVEL interrupt or just marked as such. But it seems to
> behave as such.
> 
> > Assuming it is a level interrupt, then it makes sense to treat it as ONESHOT,
> > otherwise it would fire again immediately after handling the hard
> > interrupt...but it was a hard interrupt in first place - huh.
> 
> So setting it ONESHOT while it is non-threaded does not seem to make
> sense, correct.
> 
> > I just realize that I still miss a bit of the puzzle:
> > ONESHOT is doumented as:
> > 
> > "Interrupt is not reenabled after the hardirq handler finished. Used by
> > threaded interrupts which need to keep the irq line disabled until the
> > threaded handler has been run."
> > 
> > That makes perfect sense. So ONESHOT disables the irq line until the
> > thread_fn has completed (if it was set). Now on preempt_rt inside
> > irq_setup_forced_threading() we don't force threading if ONESHOT is
> > requested. Why is that?
> 
> Because ONESHOT is usually used where there is no primary handler/ the
> primary handler does just a wake of thread.
> 
> > So I'm left with two questions:
> > - Why aren't ONESHOT irq handlers forced to threaded on preempt_rt?
> 
> See above. Also PREEMPT_RT just enforces the kernel command line
> threadirqs
> 
> > - Why was ONESHOT requested in first place as to my current knowledge it
> >   really only makes sense if a thread_fn is defined.
> 
> I would say it was a mistake and nobody noticed it. There is no visible
> difference if there is just the primary handler and the system does not
> use threadirqs (or PREEMPT_RT which enforces it).
> 
> > Did I just answer my own question? ONESHOT only makes sense if there is
> > a thread_fn and it is assumed that the hard handler is necessary. So
> > preempt_rt doesn't try to change that?
> 
> Yes. ONESHOT is used if the interrupt source within the IRQ chip has to
> be masked until after the thread completed. So setting ONESHOT without a
> threaded handler is dubious.
> 
> > That would mean the ONESHOT in the dw100 was not necessary in first
> > place but didn't do any harm until preempt_rt was enabled... And if
> > ONSHOT is *not* set preempt_rt would automatically force the irq handler
> > to be threaded and set the ONESHOT flag in irq_setup_forced_threading().
> 
> correct.
> 
> > So everything would be fine except that we'd still hit the timeout issue
> > from patch 4/4.
> > 
> > So if I got that right, the dw100 driver is in the unfortunate
> > situation, that the irq handler consists of two parts where the first
> > part *must* run in hard interrupt context and the second part *should* run
> > in hard interrupt context but it is fine if it becomes threaded due to
> > preempt_rt. As we can't model that, the best we can do is to always run
> > the second part threaded...
> 
> So happens if you avoid the IRQF_ONESHOT? Do you still get these
> timeout errors?

I did a bit more testing and got results that I fail to completely
understand.

If I enable IRQF_ONESHOT and use the threaded_fn, on a non PREEMPT_RT
system I regularly observe the timeout message.

If I pass irqflags=0 and use the hard handler on a PREEMPT_RT system I
expected the same behavior (as the hard handler gets changed to be
threaded and implicitely ONESHOT is set). But I don't see the timeout
messages.

Is there anything else that I need to do on PREEMPT_RT to force the
threaded behavior besides enabling the config? Or is the irq thread
running with higher priority and therefore possibly faster?

Running irqflags=0 and the hard handler on a non PREEMPT_RT system
didn't have any negative side effects. So maybe that is really the
solution...

I'll ping Xavier if he has more details on the hardware.

Best regards,
Stefan

> 
> > So patch 4/4 seems correct until we get new information about the
> > hardware.
> > 
> > Any thoughts?
> > 
> > Best regards,
> > Stefan
> 
> Sebastian
>
Re: [PATCH 3/4] media: dw100: Fix kernel oops with PREEMPT_RT enabled
Posted by Sebastian Andrzej Siewior 2 weeks ago
On 2026-01-14 18:22:34 [+0100], Stefan Klug wrote:
> Hi Sebastian,
Hi,

sorry for being late…

> I did a bit more testing and got results that I fail to completely
> understand.
> 
> If I enable IRQF_ONESHOT and use the threaded_fn, on a non PREEMPT_RT
> system I regularly observe the timeout message.
> 
> If I pass irqflags=0 and use the hard handler on a PREEMPT_RT system I
> expected the same behavior (as the hard handler gets changed to be
> threaded and implicitely ONESHOT is set). But I don't see the timeout
> messages.

If you have the driver with the removed ONESHOT and boot !RT with
threadirqs then you should have the same behaviour. RT threads all
handler not just one. This might change the behaviour if all interrupts
are served by one CPU and some interrupts are handled prior and have an
effect on this (threaded) one.

> Is there anything else that I need to do on PREEMPT_RT to force the
> threaded behavior besides enabling the config? Or is the irq thread
> running with higher priority and therefore possibly faster?

In both configurations (RT and !RT with threaded interrupts) the
interrupt handed is running at SCHED_FIFO prio 50. This has a higher
priority than the "normal" userland which runs at SCHED_OTHER but the
same priority as the remaining threaded interrupts.

> Running irqflags=0 and the hard handler on a non PREEMPT_RT system
> didn't have any negative side effects. So maybe that is really the
> solution...
> 
> I'll ping Xavier if he has more details on the hardware.
> 
> Best regards,
> Stefan

Sebastian