[PATCH v4 1/2] rcu: Make expedited RCU CPU stall warnings detect stall-end races

Paul E. McKenney posted 2 patches 1 month, 1 week ago
Posted by Paul E. McKenney 1 month, 1 week ago
If an expedited RCU CPU stall ends just at the stall-warning timeout,
the current code will print an expedited stall-warning message, but one
that doesn't identify any CPUs or tasks causing the stall.  This is most
likely to happen for short-timeout stalls, for example, the 20-millisecond
timeouts that are sometimes used for small embedded devices.  Needless to
say, these semi-empty stall-warning messages can be rather confusing.

One option would be to suppress the stall-warning message entirely in
this case, but the near-miss information can be quite valuable.

This commit therefore detects this race condition and emits an "INFO:
Expedited stall ended before state dump start" message to clarify matters.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tree_exp.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 96c49c56fc14a..82cada459e5d0 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -589,7 +589,12 @@ static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigne
 	pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
 		j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
 		".T"[!!data_race(rnp_root->exp_tasks)]);
-	if (ndetected) {
+	if (!ndetected) {
+		// This is invoked from the grace-period worker, so
+		// a new grace period cannot have started.  And if this
+		// worker were stalled, we would not get here.  ;-)
+		pr_err("INFO: Expedited stall ended before state dump start\n");
+	} else {
 		pr_err("blocking rcu_node structures (internal RCU debug):");
 		rcu_for_each_node_breadth_first(rnp) {
 			if (rnp == rnp_root)
-- 
2.40.1
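
The control flow that the patch changes can be modeled outside the kernel. The following is a minimal userspace sketch, not kernel code: model_stall_report(), its "blocking" array, and the "model:" header line are hypothetical stand-ins for the rcu_node scan in synchronize_rcu_expedited_stall(). Only the two messages in the final if/else are taken from the diff above. The sketch assumes that the stall timeout has already fired by the time the scan runs, which is what opens the window for the race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace model only.  blocking[cpu] is true if that (hypothetical)
 * CPU is still holding up the expedited grace period when the scan runs.
 * Returns the number of blockers found, mirroring ndetected in the patch.
 */
static int model_stall_report(const bool *blocking, int ncpus)
{
	int ndetected = 0;
	int cpu;

	assert(ncpus >= 0);

	/* The timeout has fired, so a warning header is printed regardless. */
	printf("model: detected expedited stalls on CPUs/tasks: {");
	for (cpu = 0; cpu < ncpus; cpu++) {
		if (!blocking[cpu])
			continue;	/* No longer blocking the grace period. */
		ndetected++;
		printf(" %d", cpu);
	}
	printf(" }\n");

	if (!ndetected) {
		/* Race: the stall ended between the timeout and the scan. */
		printf("INFO: Expedited stall ended before state dump start\n");
	} else {
		/* The real code walks the rcu_node tree here; elided in this model. */
		printf("blocking rcu_node structures (internal RCU debug):\n");
	}
	return ndetected;
}
```

Calling model_stall_report() with an all-false array models the near-miss case: the header is still printed with an empty list, and the INFO line explains why. Before the patch, the empty case simply skipped the dump, leaving only the semi-empty header.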
Re: [PATCH v4 1/2] rcu: Make expedited RCU CPU stall warnings detect stall-end races
Posted by Borislav Petkov 1 month, 1 week ago
On Mon, Dec 29, 2025 at 11:16:15AM -0800, Paul E. McKenney wrote:
> If an expedited RCU CPU stall ends just at the stall-warning timeout,
> the current code will print an expedited stall-warning message, but one
> that doesn't identify any CPUs or tasks causing the stall.  This is most
> likely to happen for short-timeout stalls, for example, the 20-millisecond
> timeouts that are sometimes used for small embedded devices.  Needless to
> say, these semi-empty stall-warning messages can be rather confusing.
> 
> One option would be to suppress the stall-warning message entirely in
> this case, but the near-miss information can be quite valuable.
> 
> This commit therefore detects this race condition and emits an "INFO:

s/This commit therefore detects this/Detect this/

> Expedited stall ended before state dump start" message to clarify matters.
> 
> Reported-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>  kernel/rcu/tree_exp.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

But yeah, makes sense.

Acked-by: Borislav Petkov (AMD) <bp@alien8.de>

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette