Currently, rcutorture bypasses lazy RCU by using call_rcu_hurry().
This works, avoiding the dreaded rtort_pipe_count WARN(), but fails to
fully test lazy RCU. The rtort_pipe_count WARN() splats because lazy RCU
could delay the start of an RCU grace period for a full stutter period,
which defaults to only three seconds.
This commit therefore reverts the call_rcu_hurry() instances
back to call_rcu(), but, in kernels built with CONFIG_RCU_LAZY=y,
queues a workqueue handler just before the call to stutter_wait() in
rcu_torture_writer(). This workqueue handler invokes rcu_barrier(),
which motivates any lingering lazy callbacks, thus avoiding the splat.
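The timing argument above can be illustrated with a small userspace model.
This is only a sketch: every name in it is hypothetical, the 10-second
lazy-flush delay is an assumed value chosen for illustration, and the
barrier is modeled as running before the pause rather than concurrently
from a workqueue handler as in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative userspace model only; nothing here is kernel code. */
#define STUTTER_SECS	3	/* default stutter period cited above */
#define LAZY_FLUSH_SECS	10	/* ASSUMED lazy-flush delay, for illustration */

struct lazy_cb_model {
	bool queued;	/* a callback is waiting on the modeled lazy list */
	int queued_at;	/* model time (seconds) at which it was queued */
	bool invoked;	/* the callback has run */
};

/* Model of call_rcu() on a lazy kernel: queue, but do not start a GP. */
void model_call_rcu_lazy(struct lazy_cb_model *m, int now)
{
	m->queued = true;
	m->queued_at = now;
	m->invoked = false;
}

/* Model of rcu_barrier(): force any lazily queued callback to run now. */
void model_barrier(struct lazy_cb_model *m)
{
	if (m->queued) {
		m->queued = false;
		m->invoked = true;
	}
}

/* Advance model time: a lazy callback runs only after the flush delay. */
void model_tick(struct lazy_cb_model *m, int now)
{
	if (m->queued && now - m->queued_at >= LAZY_FLUSH_SECS) {
		m->queued = false;
		m->invoked = true;
	}
}

/* Returns true if the callback is still pending when the pause ends. */
bool model_stutter_pause(bool use_barrier)
{
	struct lazy_cb_model m;
	int t;

	model_call_rcu_lazy(&m, 0);
	if (use_barrier)
		model_barrier(&m);
	for (t = 1; t <= STUTTER_SECS; t++)
		model_tick(&m, t);
	return !m.invoked;
}

/* Self-check: the barrier is what keeps the callback from lingering. */
void model_selfcheck(void)
{
	assert(model_stutter_pause(false));	/* would splat in rcutorture */
	assert(!model_stutter_pause(true));	/* barrier avoids the splat */
}
```

In this model the callback outlives the stutter pause only because the
assumed flush delay exceeds the stutter period, which is the situation
the workqueue handler is meant to defuse.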
Questions for review:
1. Should we avoid queueing work for RCU implementations not
supporting lazy callbacks?
2. Should we avoid queueing work in kernels built with
CONFIG_RCU_LAZY=y, but that were not booted with the
rcutree.enable_rcu_lazy kernel boot parameter set? (Note that
this requires some ugliness to access this parameter, and must
also handle Tiny RCU.)
3. Does the rcu_torture_ops structure need a ->call_hurry() field,
and if so, why? If not, why not?
4. Your additional questions here!
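If the answers to questions 1 and 2 turn out to be "yes", the decision
to queue the work might reduce to something like the following userspace
sketch. All names are hypothetical stand-ins: the patch itself checks
only IS_ENABLED(CONFIG_RCU_LAZY), and no accessor for the boot parameter
exists in this version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inputs; none of these names are in the patch. */
struct lazy_env_model {
	bool config_rcu_lazy;	/* models IS_ENABLED(CONFIG_RCU_LAZY) */
	bool boot_lazy_enabled;	/* models rcutree.enable_rcu_lazy, via an
				 * assumed accessor that Tiny RCU would stub */
	bool ops_supports_lazy;	/* models "this RCU flavor has lazy callbacks" */
};

/* Queue the rcu_barrier() work only when lazy callbacks can really occur. */
bool model_should_queue_lazy_work(const struct lazy_env_model *e)
{
	if (!e->config_rcu_lazy)
		return false;	/* lazy RCU not built in */
	if (!e->boot_lazy_enabled)
		return false;	/* built in, but disabled at boot (question 2) */
	return e->ops_supports_lazy;	/* flavor without lazy CBs (question 1) */
}
```

The point of ordering the checks this way is that the compile-time test
comes first, so kernels built without lazy RCU never need the accessor.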
Reported-by: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index c7958a3f8d673..270d551d66482 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -572,7 +572,7 @@ static unsigned long rcu_no_completed(void)
static void rcu_torture_deferred_free(struct rcu_torture *p)
{
- call_rcu_hurry(&p->rtort_rcu, rcu_torture_cb);
+ call_rcu(&p->rtort_rcu, rcu_torture_cb);
}
static void rcu_sync_torture_init(void)
@@ -619,7 +619,7 @@ static struct rcu_torture_ops rcu_ops = {
.poll_gp_state_exp = poll_state_synchronize_rcu,
.cond_sync_exp = cond_synchronize_rcu_expedited,
.cond_sync_exp_full = cond_synchronize_rcu_expedited_full,
- .call = call_rcu_hurry,
+ .call = call_rcu,
.cb_barrier = rcu_barrier,
.fqs = rcu_force_quiescent_state,
.gp_kthread_dbg = show_rcu_gp_kthreads,
@@ -1138,7 +1138,7 @@ static void rcu_tasks_torture_deferred_free(struct rcu_torture *p)
static void synchronize_rcu_mult_test(void)
{
- synchronize_rcu_mult(call_rcu_tasks, call_rcu_hurry);
+ synchronize_rcu_mult(call_rcu_tasks, call_rcu);
}
static struct rcu_torture_ops tasks_ops = {
@@ -1624,6 +1624,17 @@ static void do_rtws_sync(struct torture_random_state *trsp, void (*sync)(void))
cpus_read_unlock();
}
+/*
+ * Do an rcu_barrier() to motivate lazy callbacks during a stutter
+ * pause. Without this, we can get false-positive rtort_pipe_count
+ * splats.
+ */
+static void rcu_torture_writer_work(struct work_struct *work)
+{
+ if (cur_ops->cb_barrier)
+ cur_ops->cb_barrier();
+}
+
/*
* RCU torture writer kthread. Repeatedly substitutes a new structure
* for that pointed to by rcu_torture_current, freeing the old structure
@@ -1644,6 +1655,7 @@ rcu_torture_writer(void *arg)
int i;
int idx;
unsigned long j;
+ struct work_struct lazy_work;
int oldnice = task_nice(current);
struct rcu_gp_oldstate *rgo = NULL;
int rgo_size = 0;
@@ -1660,6 +1672,7 @@ rcu_torture_writer(void *arg)
stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) *
HZ * (stall_cpu_repeat + 1);
VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
+ INIT_WORK_ONSTACK(&lazy_work, rcu_torture_writer_work);
if (!can_expedite)
pr_alert("%s" TORTURE_FLAG
" GP expediting controlled from boot/sysfs for %s.\n",
@@ -1888,6 +1901,8 @@ rcu_torture_writer(void *arg)
!rcu_gp_is_normal();
}
rcu_torture_writer_state = RTWS_STUTTER;
+ if (IS_ENABLED(CONFIG_RCU_LAZY))
+ queue_work(system_percpu_wq, &lazy_work);
stutter_waited = stutter_wait("rcu_torture_writer");
if (stutter_waited &&
!atomic_read(&rcu_fwd_cb_nodelay) &&
> On Feb 27, 2026, at 7:39 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Currently, rcutorture bypasses lazy RCU by using call_rcu_hurry().
> This works, avoiding the dreaded rtort_pipe_count WARN(), but fails to
> fully test lazy RCU. The rtort_pipe_count WARN() splats because lazy RCU
> could delay the start of an RCU grace period for a full stutter period,
> which defaults to only three seconds.

But call_rcu_hurry() should completely flush, so if we are still
splatting with it, does that not mean that there is a real bug the
splat uncovered?

> This commit therefore reverts the call_rcu_hurry() instances
> back to call_rcu(), but, in kernels built with CONFIG_RCU_LAZY=y,
> queues a workqueue handler just before the call to stutter_wait() in
> rcu_torture_writer(). This workqueue handler invokes rcu_barrier(),
> which motivates any lingering lazy callbacks, thus avoiding the splat.

But nothing should be lingering with _hurry().

> Questions for review:
>
> 1. Should we avoid queueing work for RCU implementations not
> supporting lazy callbacks?

I think so, best to isolate non Lazy cases so the barrier call does
not cause side effects.

> 2. Should we avoid queueing work in kernels built with
> CONFIG_RCU_LAZY=y, but that were not booted with the
> rcutree.enable_rcu_lazy kernel boot parameter set? (Note that
> this requires some ugliness to access this parameter, and must
> also handle Tiny RCU.)

Yes I think we should avoid.

> 3. Does the rcu_torture_ops structure need a ->call_hurry() field,
> and if so, why? If not, why not?
>
> 4. Your additional questions here!

Do we have a reproducer for the splat? If there is a link to the
report, I could take a look and investigate.

thanks,

--
Joel Fernandes

On Sat, Feb 28, 2026 at 10:09:11PM -0500, Joel Fernandes wrote:
> > On Feb 27, 2026, at 7:39 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > Currently, rcutorture bypasses lazy RCU by using call_rcu_hurry().
> > This works, avoiding the dreaded rtort_pipe_count WARN(), but fails to
> > fully test lazy RCU. The rtort_pipe_count WARN() splats because lazy RCU
> > could delay the start of an RCU grace period for a full stutter period,
> > which defaults to only three seconds.
>
> But call_rcu_hurry() should completely flush, so if we are still
> splatting with it, does that not mean that there is a real bug the
> splat uncovered?

He was running an old rcutorture without call_rcu_hurry(), which did
splat. This got me thinking about how to test better. This patch, right
or wrong, is what came to mind.

> > This commit therefore reverts the call_rcu_hurry() instances
> > back to call_rcu(), but, in kernels built with CONFIG_RCU_LAZY=y,
> > queues a workqueue handler just before the call to stutter_wait() in
> > rcu_torture_writer(). This workqueue handler invokes rcu_barrier(),
> > which motivates any lingering lazy callbacks, thus avoiding the splat.
>
> But nothing should be lingering with _hurry().

True, but this patch removes the _hurry().

> > Questions for review:
> >
> > 1. Should we avoid queueing work for RCU implementations not
> > supporting lazy callbacks?
>
> I think so, best to isolate non Lazy cases so the barrier call does
> not cause side effects.

OK, done with IS_ENABLED().

> > 2. Should we avoid queueing work in kernels built with
> > CONFIG_RCU_LAZY=y, but that were not booted with the
> > rcutree.enable_rcu_lazy kernel boot parameter set? (Note that
> > this requires some ugliness to access this parameter, and must
> > also handle Tiny RCU.)
>
> Yes I think we should avoid.

That will require an accessor function, but easy enough to do. I will
update.

> > 3. Does the rcu_torture_ops structure need a ->call_hurry() field,
> > and if so, why? If not, why not?
> >
> > 4. Your additional questions here!
>
> Do we have a reproducer for the splat? If there is a link to the
> report, I could take a look and investigate.

The splat is on an old kernel where rcutorture does not yet have
call_rcu_hurry(). So this patch isn't fixing a bug, but rather allegedly
improving lazy RCU rcutorture testing.

Thanx, Paul

[...]
>>> 3. Does the rcu_torture_ops structure need a ->call_hurry() field,
>>> and if so, why? If not, why not?
>>>
>>> 4. Your additional questions here!
>>
>> Do we have a reproducer for the splat? If there is a link to the
>> report, I could take a look and investigate.
>
> The splat is on an old kernel where rcutorture does not yet have
> call_rcu_hurry(). So this patch isn't fixing a bug, but rather allegedly
> improving lazy RCU rcutorture testing.

Makes sense! I take it that on a newer kernel, we do see a splat with
just replacing _hurry() with the alternative, but without any other
changes. But either way, this is a nice change, so thanks!

--
Joel Fernandes