[PATCH RFC] rcutorture: Fully test lazy RCU

Posted by Paul E. McKenney 1 month, 2 weeks ago
Currently, rcutorture bypasses lazy RCU by using call_rcu_hurry().
This works, avoiding the dreaded rtort_pipe_count WARN(), but fails to
fully test lazy RCU.  The rtort_pipe_count WARN() splats because lazy RCU
could delay the start of an RCU grace period for a full stutter period,
which defaults to only three seconds.

This commit therefore reverts the call_rcu_hurry() instances
back to call_rcu(), but, in kernels built with CONFIG_RCU_LAZY=y,
queues a workqueue handler just before the call to stutter_wait() in
rcu_torture_writer().  This workqueue handler invokes rcu_barrier(),
which motivates any lingering lazy callbacks, thus avoiding the splat.
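
The timing argument above can be illustrated with a toy user-space
model.  This is only a sketch, not kernel code: every name in it is
hypothetical, the three-second stutter period is taken from the text
above, and the ten-second lazy flush delay and 50 ms grace period used
in the example values are assumptions.

```c
#include <stdbool.h>

/*
 * Toy user-space model of the timing problem (not kernel code).
 * All names are hypothetical.  Times are in milliseconds, measured
 * from the moment the callback is queued at the start of the pause.
 */
struct lazy_model {
	long lazy_flush_ms;	/* how long a lazy callback may wait for a GP to start */
	long gp_ms;		/* duration of one grace period */
	long stutter_ms;	/* length of the stutter pause */
};

/*
 * Time at which the callback is invoked.  A barrier issued at
 * barrier_ms starts the grace period early; a negative barrier_ms
 * means that no barrier is ever issued.
 */
static long cb_invoked_at(const struct lazy_model *m, long barrier_ms)
{
	long gp_start = m->lazy_flush_ms;

	if (barrier_ms >= 0 && barrier_ms < gp_start)
		gp_start = barrier_ms;
	return gp_start + m->gp_ms;
}

/* The post-stutter check complains if the callback is still pending. */
static bool would_splat(const struct lazy_model *m, long barrier_ms)
{
	return cb_invoked_at(m, barrier_ms) > m->stutter_ms;
}
```

With lazy_flush_ms = 10000, gp_ms = 50, and stutter_ms = 3000, the
model reports a splat when no barrier is issued and no splat when the
barrier is issued at the start of the pause.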

Questions for review:

1.	Should we avoid queueing work for RCU implementations not
	supporting lazy callbacks?

2.	Should we avoid queueing work in kernels built with
	CONFIG_RCU_LAZY=y, but that were not booted with the
	rcutree.enable_rcu_lazy kernel boot parameter set?  (Note that
	this requires some ugliness to access this parameter, and must
	also handle Tiny RCU.)

3.	Does the rcu_torture_ops structure need a ->call_hurry() field,
	and if so, why?  If not, why not?

4.	Your additional questions here!
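
To make question 3 concrete, here is one shape that an optional
->call_hurry() field could take, sketched in user space.  The types
and function names are hypothetical stand-ins rather than the real
rcu_torture_ops definition, and the sketch does not argue for or
against adding the field.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel types. */
struct rcu_head_model { int unused; };
typedef void (*cb_func_t)(struct rcu_head_model *head);
typedef void (*call_func_t)(struct rcu_head_model *head, cb_func_t func);

struct torture_ops_model {
	call_func_t call;	/* possibly lazy, like ->call() */
	call_func_t call_hurry;	/* hypothetical ->call_hurry(), may be NULL */
};

/* Use the non-lazy variant when the flavor has one, else fall back. */
static call_func_t pick_hurry(const struct torture_ops_model *ops)
{
	return ops->call_hurry ? ops->call_hurry : ops->call;
}

/* Stubs that only count invocations, for demonstration. */
static int n_lazy, n_hurry;

static void model_call(struct rcu_head_model *head, cb_func_t func)
{
	(void)head;
	(void)func;
	n_lazy++;
}

static void model_call_hurry(struct rcu_head_model *head, cb_func_t func)
{
	(void)head;
	(void)func;
	n_hurry++;
}
```

A flavor that leaves call_hurry NULL would then transparently use its
ordinary call function wherever a caller asks for the hurried variant.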

Reported-by: Saravana Kannan <saravanak@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index c7958a3f8d673..270d551d66482 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -572,7 +572,7 @@ static unsigned long rcu_no_completed(void)
 
 static void rcu_torture_deferred_free(struct rcu_torture *p)
 {
-	call_rcu_hurry(&p->rtort_rcu, rcu_torture_cb);
+	call_rcu(&p->rtort_rcu, rcu_torture_cb);
 }
 
 static void rcu_sync_torture_init(void)
@@ -619,7 +619,7 @@ static struct rcu_torture_ops rcu_ops = {
 	.poll_gp_state_exp	= poll_state_synchronize_rcu,
 	.cond_sync_exp		= cond_synchronize_rcu_expedited,
 	.cond_sync_exp_full	= cond_synchronize_rcu_expedited_full,
-	.call			= call_rcu_hurry,
+	.call			= call_rcu,
 	.cb_barrier		= rcu_barrier,
 	.fqs			= rcu_force_quiescent_state,
 	.gp_kthread_dbg		= show_rcu_gp_kthreads,
@@ -1138,7 +1138,7 @@ static void rcu_tasks_torture_deferred_free(struct rcu_torture *p)
 
 static void synchronize_rcu_mult_test(void)
 {
-	synchronize_rcu_mult(call_rcu_tasks, call_rcu_hurry);
+	synchronize_rcu_mult(call_rcu_tasks, call_rcu);
 }
 
 static struct rcu_torture_ops tasks_ops = {
@@ -1624,6 +1624,17 @@ static void do_rtws_sync(struct torture_random_state *trsp, void (*sync)(void))
 		cpus_read_unlock();
 }
 
+/*
+ * Do an rcu_barrier() to motivate lazy callbacks during a stutter
+ * pause.  Without this, we can get false-positive rtort_pipe_count
+ * splats.
+ */
+static void rcu_torture_writer_work(struct work_struct *work)
+{
+	if (cur_ops->cb_barrier)
+		cur_ops->cb_barrier();
+}
+
 /*
  * RCU torture writer kthread.  Repeatedly substitutes a new structure
  * for that pointed to by rcu_torture_current, freeing the old structure
@@ -1644,6 +1655,7 @@ rcu_torture_writer(void *arg)
 	int i;
 	int idx;
 	unsigned long j;
+	struct work_struct lazy_work;
 	int oldnice = task_nice(current);
 	struct rcu_gp_oldstate *rgo = NULL;
 	int rgo_size = 0;
@@ -1660,6 +1672,7 @@ rcu_torture_writer(void *arg)
 		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) *
 			      HZ * (stall_cpu_repeat + 1);
 	VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
+	INIT_WORK_ONSTACK(&lazy_work, rcu_torture_writer_work);
 	if (!can_expedite)
 		pr_alert("%s" TORTURE_FLAG
 			 " GP expediting controlled from boot/sysfs for %s.\n",
@@ -1888,6 +1901,8 @@ rcu_torture_writer(void *arg)
 				       !rcu_gp_is_normal();
 		}
 		rcu_torture_writer_state = RTWS_STUTTER;
+		if (IS_ENABLED(CONFIG_RCU_LAZY))
+			queue_work(system_percpu_wq, &lazy_work);
 		stutter_waited = stutter_wait("rcu_torture_writer");
 		if (stutter_waited &&
 		    !atomic_read(&rcu_fwd_cb_nodelay) &&
Re: [PATCH RFC] rcutorture: Fully test lazy RCU
Posted by Joel Fernandes 1 month, 2 weeks ago
> On Feb 27, 2026, at 7:39 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Currently, rcutorture bypasses lazy RCU by using call_rcu_hurry().
> This works, avoiding the dreaded rtort_pipe_count WARN(), but fails to
> fully test lazy RCU.  The rtort_pipe_count WARN() splats because lazy RCU
> could delay the start of an RCU grace period for a full stutter period,
> which defaults to only three seconds.

But call_rcu_hurry() should completely flush, so if we are still
splatting with it, does that not mean that there is a real bug the
splat uncovered?

> This commit therefore reverts the call_rcu_hurry() instances
> back to call_rcu(), but, in kernels built with CONFIG_RCU_LAZY=y,
> queues a workqueue handler just before the call to stutter_wait() in
> rcu_torture_writer().  This workqueue handler invokes rcu_barrier(),
> which motivates any lingering lazy callbacks, thus avoiding the splat.

But nothing should be lingering with _hurry().

> Questions for review:
>
> 1.    Should we avoid queueing work for RCU implementations not
>    supporting lazy callbacks?

I think so, best to isolate the non-lazy cases so that the barrier
call does not cause side effects.

> 2.    Should we avoid queueing work in kernels built with
>    CONFIG_RCU_LAZY=y, but that were not booted with the
>    rcutree.enable_rcu_lazy kernel boot parameter set?  (Note that
>    this requires some ugliness to access this parameter, and must
>    also handle Tiny RCU.)

Yes, I think we should avoid that.

> 3.    Does the rcu_torture_ops structure need a ->call_hurry() field,
>    and if so, why?  If not, why not?
>
> 4.    Your additional questions here!

Do we have a reproducer for the splat? If there is a link to the
report, I could take a look and investigate.

thanks,

--
Joel Fernandes
Re: [PATCH RFC] rcutorture: Fully test lazy RCU
Posted by Paul E. McKenney 1 month, 2 weeks ago
On Sat, Feb 28, 2026 at 10:09:11PM -0500, Joel Fernandes wrote:
> > On Feb 27, 2026, at 7:39 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > Currently, rcutorture bypasses lazy RCU by using call_rcu_hurry().
> > This works, avoiding the dreaded rtort_pipe_count WARN(), but fails to
> > fully test lazy RCU.  The rtort_pipe_count WARN() splats because lazy RCU
> > could delay the start of an RCU grace period for a full stutter period,
> > which defaults to only three seconds.
> 
> But call_rcu_hurry() should completely flush, so if we are still
> splatting with it, does that not mean that there is a real bug the
> splat uncovered?

Saravana was running an old rcutorture without call_rcu_hurry(), which
did splat.  This got me thinking about how to test better.  This patch,
right or wrong, is what came to mind.

> > This commit therefore reverts the call_rcu_hurry() instances
> > back to call_rcu(), but, in kernels built with CONFIG_RCU_LAZY=y,
> > queues a workqueue handler just before the call to stutter_wait() in
> > rcu_torture_writer().  This workqueue handler invokes rcu_barrier(),
> > which motivates any lingering lazy callbacks, thus avoiding the splat.
> 
> But nothing should be lingering with _hurry().

True, but this patch removes the _hurry().

> > Questions for review:
> >
> > 1.    Should we avoid queueing work for RCU implementations not
> >    supporting lazy callbacks?
> 
> I think so, best to isolate the non-lazy cases so that the barrier
> call does not cause side effects.

OK, done with IS_ENABLED().
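
For readers unfamiliar with IS_ENABLED(): it expands to a constant 1
or 0, so the compiler can discard the guarded statement entirely when
the option is off.  The sketch below is a simplified user-space
rendition of the preprocessor trick, handling only options defined
to 1.  The MODEL_ and model_ names and CONFIG_MODEL_ABSENT are
hypothetical, and the real macro also covers the =m case.

```c
/*
 * Simplified user-space rendition of an IS_ENABLED()-style macro.
 * If "option" is defined to 1, the placeholder expands to "0," which
 * shifts a 1 into the second argument slot; otherwise the second
 * argument is 0.
 */
#define MODEL_ARG_PLACEHOLDER_1 0,
#define model_take_second_arg(ignored, val, ...) val
#define model_is_defined(x) model_is_defined2(x)
#define model_is_defined2(val) model_is_defined3(MODEL_ARG_PLACEHOLDER_##val)
#define model_is_defined3(arg1_or_junk) \
	model_take_second_arg(arg1_or_junk 1, 0, 0)
#define MODEL_IS_ENABLED(option) model_is_defined(option)

#define CONFIG_RCU_LAZY 1	/* pretend this build has CONFIG_RCU_LAZY=y */
/* CONFIG_MODEL_ABSENT is deliberately left undefined. */

/* Returns 1 if the lazy-flush work would be queued in this build. */
static int lazy_work_queued(void)
{
	if (MODEL_IS_ENABLED(CONFIG_RCU_LAZY))
		return 1;
	return 0;
}

/* Same check against an option that is not set. */
static int absent_work_queued(void)
{
	if (MODEL_IS_ENABLED(CONFIG_MODEL_ABSENT))
		return 1;
	return 0;
}
```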

> > 2.    Should we avoid queueing work in kernels built with
> >    CONFIG_RCU_LAZY=y, but that were not booted with the
> >    rcutree.enable_rcu_lazy kernel boot parameter set?  (Note that
> >    this requires some ugliness to access this parameter, and must
> >    also handle Tiny RCU.)
> 
> Yes, I think we should avoid that.

That will require an accessor function, but easy enough to do.  I will
update.
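
A minimal user-space sketch of what such a check might look like,
combining the build-time test with a run-time accessor.  Every name
here is hypothetical, including the accessor itself, and the real
version would also need some definition for Tiny RCU.

```c
#include <stdbool.h>

/* Stands in for the rcutree.enable_rcu_lazy boot parameter. */
static bool model_enable_rcu_lazy = true;

/* Hypothetical accessor hiding the parameter from rcutorture. */
static bool model_rcu_lazy_enabled(void)
{
	return model_enable_rcu_lazy;
}

/*
 * Queue the barrier work only if lazy RCU was built in, the flavor
 * under test has a callback barrier, and laziness was enabled at boot.
 */
static bool should_queue_lazy_work(bool built_lazy, bool have_cb_barrier)
{
	return built_lazy && have_cb_barrier && model_rcu_lazy_enabled();
}
```

Evaluating the build-time condition first keeps the accessor call out
of kernels that cannot have lazy callbacks at all.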

> > 3.    Does the rcu_torture_ops structure need a ->call_hurry() field,
> >    and if so, why?  If not, why not?
> >
> > 4.    Your additional questions here!
> 
> Do we have a reproducer for the splat? If there is a link to the
> report, I could take a look and investigate.

The splat is on an old kernel where rcutorture does not yet have
call_rcu_hurry().  So this patch isn't fixing a bug, but rather allegedly
improving lazy RCU rcutorture testing.

							Thanx, Paul
Re: [PATCH RFC] rcutorture: Fully test lazy RCU
Posted by Joel Fernandes 1 month, 2 weeks ago
[...]

>>> 3.    Does the rcu_torture_ops structure need a ->call_hurry() field,
>>>   and if so, why?  If not, why not?
>>>
>>> 4.    Your additional questions here!
>>
>> Do we have a reproducer for the splat? If there is a link to the
>> report, I could take a look and investigate.
>
> The splat is on an old kernel where rcutorture does not yet have
> call_rcu_hurry().  So this patch isn't fixing a bug, but rather allegedly
> improving lazy RCU rcutorture testing.

Makes sense! I take it that on a newer kernel we would also see a
splat after just replacing _hurry() with plain call_rcu(), without any
of the other changes. Either way, this is a nice change, so thanks!

-- 
Joel Fernandes