[PATCH RFC v1 0/6] rcu: fix stuck defer_qs_pending state

Joel Fernandes posted 6 patches 2 days, 4 hours ago
This series fixes a bug where rdp->defer_qs_pending can remain stuck in
PENDING when a preempted reader's quiescent state is reported up-tree via
a path other than the deferred-QS irq-work handler (FQS scan, hotplug
transition, expedited GP IPI, context switch). Once stuck, the pending
gate in rcu_read_unlock_special() silently suppresses all future arming
attempts on that CPU. The series adds PENDING -> IDLE transitions at the
missing sites.
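
To make the failure mode concrete, here is a minimal userspace sketch of the
pending gate. It is a toy model under stated assumptions, not kernel code:
every toy_* name is hypothetical, and the model simply assumes that the
irq-work handler never gets to clear the flag once the quiescent state has
been reported through another path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code; every toy_* name is hypothetical. */
enum toy_defer_qs { TOY_IDLE, TOY_PENDING };

/* How the quiescent state ends up being reported in this model. */
enum toy_path {
	TOY_VIA_HANDLER,	/* deferred-QS irq-work handler runs */
	TOY_VIA_OTHER_UNFIXED,	/* other path, flag left untouched */
	TOY_VIA_OTHER_FIXED,	/* other path, flag cleared as well */
};

struct toy_rdp {
	enum toy_defer_qs defer_qs_pending;	/* models rdp->defer_qs_pending */
};

/* Models the pending gate: arm only if nothing is already pending. */
static bool toy_try_arm(struct toy_rdp *rdp)
{
	if (rdp->defer_qs_pending == TOY_PENDING)
		return false;	/* arming attempt silently suppressed */
	rdp->defer_qs_pending = TOY_PENDING;
	return true;
}

/* Models the quiescent state being reported through the given path. */
static void toy_report_qs(struct toy_rdp *rdp, enum toy_path path)
{
	switch (path) {
	case TOY_VIA_HANDLER:
	case TOY_VIA_OTHER_FIXED:
		rdp->defer_qs_pending = TOY_IDLE;	/* PENDING -> IDLE */
		break;
	case TOY_VIA_OTHER_UNFIXED:
		break;	/* flag stays PENDING */
	}
}

/* Arm once, report the QS via 'path', then return whether re-arming works. */
static bool toy_rearm_after(enum toy_path path)
{
	struct toy_rdp rdp = { .defer_qs_pending = TOY_IDLE };

	assert(toy_try_arm(&rdp));
	toy_report_qs(&rdp, path);
	return toy_try_arm(&rdp);
}
```

In this model, toy_rearm_after(TOY_VIA_OTHER_UNFIXED) returns false, while the
other two paths allow the second arming attempt to go through.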

Patch 3 also handles the case where the deferred-QS irq-work handler runs
between segments of a compound section (per Paul McKenney's counter-example).

Patch 5 handles the expedited GP case. I have not yet looked at how this
interacts with the softirq paths, so I am keeping the RFC tag on.

The last patch is a debug-only detector (CONFIG_RCU_GP_CLEANUP_STALE_CHECK,
marked [TEST COMMIT], not for merge). Applied alone on unmodified mainline,
without patches 2-5, it reliably fires a WARN within 5 minutes under TREE03
rcutorture, which confirms that the bug exists and that the detector catches
it. With the full fix applied, I could not reproduce the issue.
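
As a rough illustration of what a check at GP cleanup could look like, here is
a userspace sketch. It is only one plausible shape for such a check: the
condition actually used in the [TEST COMMIT] patch is not shown in this cover
letter, the toy_* names are hypothetical, and the handler_queued field is an
assumption made so the model can tell a stale PENDING from one that a queued
handler will still clear.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, not kernel code; every toy_* name is hypothetical. */
enum toy_dqs_state { TOY_DQS_IDLE, TOY_DQS_PENDING };

struct toy_cpu {
	enum toy_dqs_state defer_qs_pending;	/* models rdp->defer_qs_pending */
	bool handler_queued;	/* assumption: a queued handler will clear it */
};

/*
 * Called at (modelled) GP cleanup: count the CPUs left in PENDING with no
 * handler queued to clear the state. A real detector would WARN instead of
 * counting.
 */
static size_t toy_count_stale(const struct toy_cpu *cpus, size_t ncpus)
{
	size_t stale = 0;

	for (size_t i = 0; i < ncpus; i++) {
		if (cpus[i].defer_qs_pending == TOY_DQS_PENDING &&
		    !cpus[i].handler_queued)
			stale++;
	}
	return stale;
}
```

A nonzero return value stands in for the WARN: it means at least one modelled
CPU would keep suppressing arming attempts after the grace period has ended.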

The git tree with all patches can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/tag/?h=rcu-dqs-stuck-rfc-v1-20260522

Joel Fernandes (6):
  rcu: introduce rcu_defer_qs_clear() helper
  rcu: clear defer_qs_pending when notifying GP changes
  rcu: clear defer_qs_pending in handler for compounded sections
  rcu: drop redundant defer_qs_pending clear in irqrestore handler
  rcu: clear defer_qs_pending at expedited IPI entry
  [TEST COMMIT] rcu: detect stuck defer_qs_pending at GP cleanup

 kernel/rcu/Kconfig.debug | 11 +++++++++
 kernel/rcu/tree.c        | 49 ++++++++++++++++++++++++++++++++++++++++
 kernel/rcu/tree.h        | 13 +++++++++++
 kernel/rcu/tree_exp.h    |  6 +++++
 kernel/rcu/tree_plugin.h | 30 ++++++++++++------------
 5 files changed, 94 insertions(+), 15 deletions(-)

-- 
2.34.1