[v2] rcu/exp updates

[PATCH 4/5] rcu/exp: Warn on QS requested on dying CPU

Posted by Frederic Weisbecker 11 months ago

It is not possible to send an IPI to a dying CPU that has passed the
CPUHP_TEARDOWN_CPU stage. Remaining unhandled IPIs are handled later at
CPUHP_AP_SMPCFD_DYING stage by stop machine. This is the last
opportunity for RCU exp handler to request an expedited quiescent state.
And the upcoming final context switch between stop machine and idle must
have reported the requested quiescent state.

Therefore, it should not be possible to observe a pending requested
expedited quiescent state when RCU finally stops watching the outgoing
CPU. Once IPIs aren't possible anymore, the QS for the target CPU will
be reported on its behalf by the RCU exp kworker.

Provide an assertion to verify those expectations.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 3fe68057d8b4..79dced5fb72e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4321,6 +4321,12 @@ void rcutree_report_cpu_dead(void)
 	 * may introduce a new READ-side while it is actually off the QS masks.
 	 */
 	lockdep_assert_irqs_disabled();
+	/*
+	 * CPUHP_AP_SMPCFD_DYING was the last call for rcu_exp_handler() execution.
+	 * The requested QS must have been reported on the last context switch
+	 * from stop machine to idle.
+	 */
+	WARN_ON_ONCE(rdp->cpu_no_qs.b.exp);
 	// Do any dangling deferred wakeups.
 	do_nocb_deferred_wakeup(rdp);
 
-- 
2.48.1
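
The ordering argument in the changelog can be modeled in user space. What
follows is only a sketch under stated assumptions, not kernel code: struct
model_rdp and every model_*() function are hypothetical names invented for
illustration. The exp_qs_pending field stands in for rdp->cpu_no_qs.b.exp,
and the two hotplug steps are collapsed into single function calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU RCU state; not struct rcu_data. */
struct model_rdp {
	bool exp_qs_pending;	/* models rdp->cpu_no_qs.b.exp */
	bool ipi_reachable;	/* false once no further IPI can be handled */
	int warnings;		/* how many times the modeled assertion fired */
};

/*
 * Models the expedited IPI handler requesting a quiescent state.
 * Returns false when the CPU can no longer take the IPI, in which case
 * the requester has to report the QS on the CPU's behalf.
 */
static bool model_request_exp_qs(struct model_rdp *rdp)
{
	assert(rdp != NULL);
	if (!rdp->ipi_reachable)
		return false;
	rdp->exp_qs_pending = true;
	return true;
}

/* Models the last stage at which pending IPIs are still handled. */
static void model_last_ipi_flush(struct model_rdp *rdp)
{
	assert(rdp != NULL);
	rdp->ipi_reachable = false;
}

/* Models the final context switch from stop machine to idle. */
static void model_final_context_switch(struct model_rdp *rdp)
{
	assert(rdp != NULL);
	rdp->exp_qs_pending = false;	/* the requested QS gets reported here */
}

/* Models the added check; returns true if the warning would fire. */
static bool model_report_cpu_dead(struct model_rdp *rdp)
{
	assert(rdp != NULL);
	if (rdp->exp_qs_pending) {
		rdp->warnings++;
		return true;
	}
	return false;
}
```

Calling the functions in hotplug order (request, last IPI flush, final
context switch, report dead) leaves nothing pending, so the modeled check
stays silent. It fires only if a request is still pending when the CPU is
reported dead, which is the situation the patch treats as a bug.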

Re: [PATCH 4/5] rcu/exp: Warn on QS requested on dying CPU

Posted by Paul E. McKenney 10 months, 3 weeks ago

On Fri, Mar 14, 2025 at 03:36:41PM +0100, Frederic Weisbecker wrote:
> It is not possible to send an IPI to a dying CPU that has passed the
> CPUHP_TEARDOWN_CPU stage. Remaining unhandled IPIs are handled later at
> CPUHP_AP_SMPCFD_DYING stage by stop machine. This is the last
> opportunity for RCU exp handler to request an expedited quiescent state.
> And the upcoming final context switch between stop machine and idle must
> have reported the requested context switch.
> 
> Therefore, it should not be possible to observe a pending requested
> expedited quiescent state when RCU finally stops watching the outgoing
> CPU. Once IPIs aren't possible anymore, the QS for the target CPU will
> be reported on its behalf by the RCU exp kworker.
> 
> Provide an assertion to verify those expectations.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

But what do we do if this assertion triggers?  And do we want it to take
effect only in kernels built with CONFIG_PROVE_RCU?  Or is such a broken
assumption bad enough to justify a splat in production kernels?

If the answer to the last question is "yes" (and you, not me, work for
a distro, so it is your question to answer):

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

							Thanx, Paul


Re: [PATCH 4/5] rcu/exp: Warn on QS requested on dying CPU

Posted by Frederic Weisbecker 10 months, 3 weeks ago

On Tue, Mar 18, 2025 at 10:21:48AM -0700, Paul E. McKenney wrote:
> On Fri, Mar 14, 2025 at 03:36:41PM +0100, Frederic Weisbecker wrote:
> > It is not possible to send an IPI to a dying CPU that has passed the
> > CPUHP_TEARDOWN_CPU stage. Remaining unhandled IPIs are handled later at
> > CPUHP_AP_SMPCFD_DYING stage by stop machine. This is the last
> > opportunity for RCU exp handler to request an expedited quiescent state.
> > And the upcoming final context switch between stop machine and idle must
> > have reported the requested context switch.
> > 
> > Therefore, it should not be possible to observe a pending requested
> > expedited quiescent state when RCU finally stops watching the outgoing
> > CPU. Once IPIs aren't possible anymore, the QS for the target CPU will
> > be reported on its behalf by the RCU exp kworker.
> > 
> > Provide an assertion to verify those expectations.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> 
> But what do we do if this assertion triggers?

It means there is likely something to fix because an IPI has been sent
and somehow the CPU missed it.

> And do we want it to take
> effect only in kernels built with CONFIG_PROVE_RCU?  Or is such a broken
> assumption bad enough to justify a splat in production kernels?
> 
> If the answer to the last question is "yes" (and you, not me, work for
> a distro, so it is your question to answer):

I think it's bad enough to deserve a real warning. Also this is a slow path.

> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

Thanks!

> 
> 							Thanx, Paul
> 
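
The choice discussed in this exchange, an unconditional warning versus one
limited to CONFIG_PROVE_RCU kernels, can be illustrated with a small
user-space sketch. Everything here is hypothetical: MODEL_PROVE_RCU stands in
for the kernel config option, and model_warn_once() only imitates the
warn-once-per-call-site behavior of WARN_ON_ONCE(). In the kernel, a
debug-only variant might plausibly be spelled
WARN_ON_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && rdp->cpu_no_qs.b.exp), but that
is an assumption for illustration, not something proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_PROVE_RCU; 0 models a production build. */
#define MODEL_PROVE_RCU 0

static int model_splats;	/* number of warnings actually printed */

/* Prints at most once per 'warned' flag and returns the condition. */
static bool model_warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		model_splats++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Unconditional check, as in the patch: active in every build. */
static bool check_always(bool exp_qs_pending)
{
	static bool warned;

	return model_warn_once(exp_qs_pending, &warned,
			       "expedited QS still pending on dying CPU");
}

/* Debug-only alternative: the condition is constant-false in production. */
static bool check_debug_only(bool exp_qs_pending)
{
	static bool warned;

	return model_warn_once(MODEL_PROVE_RCU && exp_qs_pending, &warned,
			       "expedited QS still pending on dying CPU");
}
```

With MODEL_PROVE_RCU set to 0, check_debug_only() never reports anything,
while check_always() reports the first violation and stays quiet on repeats.
Since the check sits on the CPU-offline slow path, the unconditional form
costs little, which matches the reasoning given in the reply above.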

[PATCH 1/5] rcu/exp: Protect against early QS report
[PATCH 2/5] rcu/exp: Remove confusing needless full barrier on task unblock
[PATCH 3/5] rcu/exp: Remove needless CPU up quiescent state report
[PATCH 4/5] rcu/exp: Warn on QS requested on dying CPU
[PATCH 5/5] rcu/exp: Warn on CPU lagging for too long within hotplug IPI's blindspot