[PATCH v2] smc91x: fix broken irq-context in PREEMPT_RT

Posted by Yeoreum Yun 1 month, 3 weeks ago
When smc91x.c is built with PREEMPT_RT, the following splat occurs
in FVP_RevC:

[   13.055000] smc91x LNRO0003:00 eth0: link up, 10Mbps, half-duplex, lpa 0x0000
[   13.062137] BUG: workqueue leaked atomic, lock or RCU: kworker/2:1[106]
[   13.062137]      preempt=0x00000000 lock=0->0 RCU=0->1 workfn=mld_ifc_work
[   13.062266] C
** replaying previous printk message **
[   13.062266] CPU: 2 UID: 0 PID: 106 Comm: kworker/2:1 Not tainted 6.18.0-dirty #179 PREEMPT_{RT,(full)}
[   13.062353] Hardware name:  , BIOS
[   13.062382] Workqueue: mld mld_ifc_work
[   13.062469] Call trace:
[   13.062494]  show_stack+0x24/0x40 (C)
[   13.062602]  __dump_stack+0x28/0x48
[   13.062710]  dump_stack_lvl+0x7c/0xb0
[   13.062818]  dump_stack+0x18/0x34
[   13.062926]  process_scheduled_works+0x294/0x450
[   13.063043]  worker_thread+0x260/0x3d8
[   13.063124]  kthread+0x1c4/0x228
[   13.063235]  ret_from_fork+0x10/0x20

This happens because smc_special_trylock() disables IRQs with
local_irq_save() even on PREEMPT_RT, but smc_special_unlock() does not
restore them there: it calls spin_unlock_irqrestore(), which on PREEMPT_RT
does not touch the IRQ state. With IRQs still disabled,
rcu_read_unlock_bh() in __dev_queue_xmit() cannot invoke rcu_read_unlock()
through __local_bh_enable_ip() when current->softirq_disable_cnt becomes
zero, so the RCU read-side section is leaked.

To address this issue, replace smc_special_trylock() with spin_trylock_irqsave().

Fixes: 8ff499e43c53 ("smc91x: let smc91x work well under netpoll")
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
This patch is based on v6.18.

History
========

From v1 to v2:
  - remove debug log.
  - https://lore.kernel.org/all/20251212185818.2209573-1-yeoreum.yun@arm.com/
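
Illustration (not part of the patch): the leak described in the commit
message can be sketched with a small userspace model. Everything below is
a hypothetical stand-in rather than kernel API: struct cpu_model, the
rt_*/old_*/model_* helpers and xmit_leaked_rcu() are invented names, and
the bottom-half logic is a simplified reading of the PREEMPT_RT behaviour
(the RCU read-side section is only dropped when the outermost BH-disable
level ends in a preemptible context).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU state; all names are userspace stand-ins, not kernel API. */
struct cpu_model {
	bool irqs_off;    /* models the hardware IRQ mask */
	bool lock_held;   /* models the driver's spinlock */
	int  bh_disable;  /* models current->softirq_disable_cnt */
	int  rcu_nesting; /* models the RCU read-side nesting depth */
};

/* RT-style spin_trylock_irqsave(): IRQ state is left untouched. */
static bool rt_spin_trylock_irqsave(struct cpu_model *c, unsigned long *flags)
{
	*flags = 0;
	if (c->lock_held)
		return false;
	c->lock_held = true;
	return true;
}

/* RT-style spin_unlock_irqrestore(): flags are ignored. */
static void rt_spin_unlock_irqrestore(struct cpu_model *c, unsigned long flags)
{
	(void)flags;
	assert(c->lock_held);
	c->lock_held = false;
}

/* The removed macro: local_irq_save() followed by spin_trylock(). */
static bool old_special_trylock(struct cpu_model *c, unsigned long *flags)
{
	*flags = c->irqs_off;          /* local_irq_save() */
	c->irqs_off = true;
	if (c->lock_held) {
		c->irqs_off = (*flags != 0); /* local_irq_restore() on failure */
		return false;
	}
	c->lock_held = true;
	return true;
}

/* Simplified RT rcu_read_lock_bh(): the outermost level takes RCU. */
static void model_rcu_read_lock_bh(struct cpu_model *c)
{
	bool preemptible = !c->irqs_off;

	if (c->bh_disable == 0 && preemptible)
		c->rcu_nesting++;
	c->bh_disable++;
}

/* Simplified RT rcu_read_unlock_bh(): RCU is dropped only if preemptible. */
static void model_rcu_read_unlock_bh(struct cpu_model *c)
{
	bool preemptible = !c->irqs_off;

	c->bh_disable--;
	if (c->bh_disable == 0 && preemptible)
		c->rcu_nesting--;
}

/* Returns the RCU nesting left behind after one modelled transmit. */
int xmit_leaked_rcu(bool use_old_trylock)
{
	struct cpu_model c = { 0 };
	unsigned long flags;
	bool ok;

	model_rcu_read_lock_bh(&c);               /* __dev_queue_xmit() entry */
	ok = use_old_trylock ? old_special_trylock(&c, &flags)
			     : rt_spin_trylock_irqsave(&c, &flags);
	if (ok)
		rt_spin_unlock_irqrestore(&c, flags); /* smc_special_unlock() */
	model_rcu_read_unlock_bh(&c);             /* __dev_queue_xmit() exit */
	return c.rcu_nesting;
}
```

In this model, xmit_leaked_rcu(true) leaves the nesting count at 1, which
corresponds to the "RCU=0->1" in the splat, while xmit_leaked_rcu(false)
returns to 0.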

---
 drivers/net/ethernet/smsc/smc91x.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c
index 9d1a83a5fa7e..d16c178d1034 100644
--- a/drivers/net/ethernet/smsc/smc91x.c
+++ b/drivers/net/ethernet/smsc/smc91x.c
@@ -516,15 +516,7 @@ static inline void  smc_rcv(struct net_device *dev)
  * any other concurrent access and C would always interrupt B. But life
  * isn't that easy in a SMP world...
  */
-#define smc_special_trylock(lock, flags)				\
-({									\
-	int __ret;							\
-	local_irq_save(flags);						\
-	__ret = spin_trylock(lock);					\
-	if (!__ret)							\
-		local_irq_restore(flags);				\
-	__ret;								\
-})
+#define smc_special_trylock(lock, flags)	spin_trylock_irqsave(lock, flags)
 #define smc_special_lock(lock, flags)		spin_lock_irqsave(lock, flags)
 #define smc_special_unlock(lock, flags) 	spin_unlock_irqrestore(lock, flags)
 #else
--
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
Re: [PATCH v2] smc91x: fix broken irq-context in PREEMPT_RT
Posted by Simon Horman 1 month, 3 weeks ago
On Fri, Dec 12, 2025 at 07:03:38PM +0000, Yeoreum Yun wrote:
> When smc91x.c is built with PREEMPT_RT, the following splat occurs
> in FVP_RevC:
> 
> [   13.055000] smc91x LNRO0003:00 eth0: link up, 10Mbps, half-duplex, lpa 0x0000
> [   13.062137] BUG: workqueue leaked atomic, lock or RCU: kworker/2:1[106]
> [   13.062137]      preempt=0x00000000 lock=0->0 RCU=0->1 workfn=mld_ifc_work
> [   13.062266] C
> ** replaying previous printk message **
> [   13.062266] CPU: 2 UID: 0 PID: 106 Comm: kworker/2:1 Not tainted 6.18.0-dirty #179 PREEMPT_{RT,(full)}
> [   13.062353] Hardware name:  , BIOS
> [   13.062382] Workqueue: mld mld_ifc_work
> [   13.062469] Call trace:
> [   13.062494]  show_stack+0x24/0x40 (C)
> [   13.062602]  __dump_stack+0x28/0x48
> [   13.062710]  dump_stack_lvl+0x7c/0xb0
> [   13.062818]  dump_stack+0x18/0x34
> [   13.062926]  process_scheduled_works+0x294/0x450
> [   13.063043]  worker_thread+0x260/0x3d8
> [   13.063124]  kthread+0x1c4/0x228
> [   13.063235]  ret_from_fork+0x10/0x20
> 
> This happens because smc_special_trylock() disables IRQs even on PREEMPT_RT,
> but smc_special_unlock() does not restore IRQs on PREEMPT_RT.
> The reason is that smc_special_unlock() calls spin_unlock_irqrestore(),
> and rcu_read_unlock_bh() in __dev_queue_xmit() cannot invoke
> rcu_read_unlock() through __local_bh_enable_ip() when current->softirq_disable_cnt becomes zero.
> 
> To address this issue, replace smc_special_trylock() with spin_trylock_irqsave().
> 
> Fixes: 8ff499e43c53 ("smc91x: let smc91x work well under netpoll")
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
> This patch based on v6.18.
> 
> History
> ========
> 
> From v1 to v2:
>   - remove debug log.
>   - https://lore.kernel.org/all/20251212185818.2209573-1-yeoreum.yun@arm.com/
> 

Firstly, I'd like to note that the last non-trivial update to this
driver seems to have occurred back in 2016.
Do you know if it is still actively used?

I agree that this patch seems appropriate as a bug fix.
But I do wonder if, as a follow-up for net-next when it re-opens,
smc_special_*lock could be removed entirely.
Other than being the source of this bug (which I guess is special),
they don't seem very special anymore. Perhaps they were once,
but that time seems to have passed.

Regarding the Fixes tag, I wonder if this one, which post-dates the
currently cited commit, is the correct one. It seems to be when the RT
variants of these locks were introduced.

Fixes: 342a93247e08 ("locking/spinlock: Provide RT variant header: <linux/spinlock_rt.h>")

Lastly, for reference, when posting fixes for Networking code, please:

* Target the patches at net like this:

  [PATCH net] ...

* Allow at least 24h to pass before posting updated patch versions

More can be found here: https://docs.kernel.org/process/maintainer-netdev.html


Reviewed-by: Simon Horman <horms@kernel.org>
Re: [PATCH v2] smc91x: fix broken irq-context in PREEMPT_RT
Posted by Yeoreum Yun 1 month, 3 weeks ago
Hi Simon,

> On Fri, Dec 12, 2025 at 07:03:38PM +0000, Yeoreum Yun wrote:
> > When smc91x.c is built with PREEMPT_RT, the following splat occurs
> > in FVP_RevC:
> >
> > [   13.055000] smc91x LNRO0003:00 eth0: link up, 10Mbps, half-duplex, lpa 0x0000
> > [   13.062137] BUG: workqueue leaked atomic, lock or RCU: kworker/2:1[106]
> > [   13.062137]      preempt=0x00000000 lock=0->0 RCU=0->1 workfn=mld_ifc_work
> > [   13.062266] C
> > ** replaying previous printk message **
> > [   13.062266] CPU: 2 UID: 0 PID: 106 Comm: kworker/2:1 Not tainted 6.18.0-dirty #179 PREEMPT_{RT,(full)}
> > [   13.062353] Hardware name:  , BIOS
> > [   13.062382] Workqueue: mld mld_ifc_work
> > [   13.062469] Call trace:
> > [   13.062494]  show_stack+0x24/0x40 (C)
> > [   13.062602]  __dump_stack+0x28/0x48
> > [   13.062710]  dump_stack_lvl+0x7c/0xb0
> > [   13.062818]  dump_stack+0x18/0x34
> > [   13.062926]  process_scheduled_works+0x294/0x450
> > [   13.063043]  worker_thread+0x260/0x3d8
> > [   13.063124]  kthread+0x1c4/0x228
> > [   13.063235]  ret_from_fork+0x10/0x20
> >
> > This happens because smc_special_trylock() disables IRQs even on PREEMPT_RT,
> > but smc_special_unlock() does not restore IRQs on PREEMPT_RT.
> > The reason is that smc_special_unlock() calls spin_unlock_irqrestore(),
> > and rcu_read_unlock_bh() in __dev_queue_xmit() cannot invoke
> > rcu_read_unlock() through __local_bh_enable_ip() when current->softirq_disable_cnt becomes zero.
> >
> > To address this issue, replace smc_special_trylock() with spin_trylock_irqsave().
> >
> > Fixes: 8ff499e43c53 ("smc91x: let smc91x work well under netpoll")
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> > This patch based on v6.18.
> >
> > History
> > ========
> >
> > From v1 to v2:
> >   - remove debug log.
> >   - https://lore.kernel.org/all/20251212185818.2209573-1-yeoreum.yun@arm.com/
> >
>
> Firstly, I'd like to note that it seems to me that the last
> non-trivial update to this driver seems to have occurred back in 2016.
> Do you know if it is still actively used?

Unfortunately, I don't know whether it is still actively used on real hardware.
AFAIK it can be used in QEMU or the Arm FVP.

>
> I agree that this patch seems appropriate as a bug fix.
> But I do wonder if, as a follow-up for net-next when it re-opens,
> smc_special_*lock could be removed entirely.
> Other than being the source of this bug (which I guess is special),
> they don't seem very special anymore. Perhaps they were once,
> but that time seems to have passed.

IIUC, you mean removing smc_special_*lock() and replacing them with the
plain spin_*lock_*() calls, right? If so, that seems good to me.
Even in the UP case, I think it is better to protect the critical
sections in smc_hardware_send_pkt() and smc_hard_start_xmit() from the
interrupt handler with spin_lock_irqsave().
>
> Regarding the Fixes tag. I wonder if this one, which post-dates the
> currently cited commit is correct. It seems to be when RT variants of
> these locks was introduced.
>
> Fixes: 342a93247e08 ("locking/spinlock: Provide RT variant header: <linux/spinlock_rt.h>")

Yes, that would be better.

>
> Lastly, for reference, when posting fixes for Networking code, please:
>
> * Target the patches at net like this:
>
>   [PATCH net] ...

Thanks for letting me know.

>
> * Allow at least 24h to pass before posting updated patch versions
>
> More can be found here: https://docs.kernel.org/process/maintainer-netdev.html
>
>
> Reviewed-by: Simon Horman <horms@kernel.org>

Thanks for the review :).
In the next version, I'll reformat the title,
change the Fixes tag and include your R-b tag.

--
Sincerely,
Yeoreum Yun