[PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning

Posted by Paul E. McKenney 1 year ago
The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
This can (and does) result in extraneous lockdep warnings when this
WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
of helpful.

Therefore, invert the WARN_ON_ONCE() condition and move it before the
call to ct_state_inc().  This does mean that the ct_state_inc() return
value can no longer be used in the WARN_ON_ONCE() condition, so discard
this return value and instead use a call to rcu_is_watching_curr_cpu().
This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
is no added overhead in production use.

Reported-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <vschneid@redhat.com>

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 938c48952d26..fb5be6e9b423 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
  */
 static noinstr void ct_kernel_exit_state(int offset)
 {
-	int seq;
-
 	/*
 	 * CPUs seeing atomic_add_return() must see prior RCU read-side
 	 * critical sections, and we also must force ordering with the
 	 * next idle sojourn.
 	 */
 	rcu_task_trace_heavyweight_enter();  // Before CT state update!
-	seq = ct_state_inc(offset);
-	// RCU is no longer watching.  Better be in extended quiescent state!
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
+	// RCU is still watching.  Better not be in extended quiescent state!
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
+	(void)ct_state_inc(offset);
+	// RCU is no longer watching.
 }
 
 /*
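
The condition inversion described in the commit message can be checked in isolation. What follows is a minimal userspace sketch, not kernel code. MODEL_RCU_WATCHING, its value, and both helper names are hypothetical, and the offset is assumed to be exactly the watching bit, so that adding it flips that bit. Under that assumption, testing the bit before the update with the inverted sense gives the same answer as testing it after the update.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CT_RCU_WATCHING; the value is arbitrary. */
#define MODEL_RCU_WATCHING 0x4u

/*
 * Old ordering: the state is updated first, and the warning condition
 * looks at the updated value (what ct_state_inc() would have returned).
 */
static bool old_check_fires(unsigned int state_before, unsigned int offset)
{
	unsigned int seq = state_before + offset;

	return (seq & MODEL_RCU_WATCHING) != 0;
}

/*
 * New ordering: the warning condition looks at the state before the
 * update, with the sense inverted (warn if RCU is not watching yet).
 */
static bool new_check_fires(unsigned int state_before)
{
	return (state_before & MODEL_RCU_WATCHING) == 0;
}
```

Calling both helpers over a range of starting states with offset equal to MODEL_RCU_WATCHING should show them agreeing on every state. The sketch says nothing about lockdep or about offsets that also change lower bits.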
Re: [PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning
Posted by Frederic Weisbecker 1 year ago
On Sat, Feb 01, 2025 at 10:44:02AM -0800, Paul E. McKenney wrote:
> The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
> ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
> This can (and does) result in extraneous lockdep warnings when this
> WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
> of helpful.
> 
> Therefore, invert the WARN_ON_ONCE() condition and move it before the
> call to ct_state_inc().  This does mean that the ct_state_inc() return
> value can no longer be used in the WARN_ON_ONCE() condition, so discard
> this return value and instead use a call to rcu_is_watching_curr_cpu().
> This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
> is no added overhead in production use.
> 
> Reported-by: Breno Leitao <leitao@debian.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Valentin Schneider <vschneid@redhat.com>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>


> 
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 938c48952d26..fb5be6e9b423 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
>   */
>  static noinstr void ct_kernel_exit_state(int offset)
>  {
> -	int seq;
> -
>  	/*
>  	 * CPUs seeing atomic_add_return() must see prior RCU read-side
>  	 * critical sections, and we also must force ordering with the
>  	 * next idle sojourn.
>  	 */
>  	rcu_task_trace_heavyweight_enter();  // Before CT state update!
> -	seq = ct_state_inc(offset);
> -	// RCU is no longer watching.  Better be in extended quiescent state!
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> +	// RCU is still watching.  Better not be in extended quiescent state!
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
> +	(void)ct_state_inc(offset);
> +	// RCU is no longer watching.
>  }
>  
>  /*
Re: [PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning
Posted by Paul E. McKenney 1 year ago
On Wed, Feb 05, 2025 at 06:04:41PM +0100, Frederic Weisbecker wrote:
> On Sat, Feb 01, 2025 at 10:44:02AM -0800, Paul E. McKenney wrote:
> > The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
> > ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
> > This can (and does) result in extraneous lockdep warnings when this
> > WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
> > of helpful.
> > 
> > Therefore, invert the WARN_ON_ONCE() condition and move it before the
> > call to ct_state_inc().  This does mean that the ct_state_inc() return
> > value can no longer be used in the WARN_ON_ONCE() condition, so discard
> > this return value and instead use a call to rcu_is_watching_curr_cpu().
> > This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
> > is no added overhead in production use.
> > 
> > Reported-by: Breno Leitao <leitao@debian.org>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Valentin Schneider <vschneid@redhat.com>
> 
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thank you!  I will apply this on my next rebase.

							Thanx, Paul

> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index 938c48952d26..fb5be6e9b423 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
> >   */
> >  static noinstr void ct_kernel_exit_state(int offset)
> >  {
> > -	int seq;
> > -
> >  	/*
> >  	 * CPUs seeing atomic_add_return() must see prior RCU read-side
> >  	 * critical sections, and we also must force ordering with the
> >  	 * next idle sojourn.
> >  	 */
> >  	rcu_task_trace_heavyweight_enter();  // Before CT state update!
> > -	seq = ct_state_inc(offset);
> > -	// RCU is no longer watching.  Better be in extended quiescent state!
> > -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> > +	// RCU is still watching.  Better not be in extended quiescent state!
> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
> > +	(void)ct_state_inc(offset);
> > +	// RCU is no longer watching.
> >  }
> >  
> >  /*
Re: [PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning
Posted by Valentin Schneider 1 year ago
On 01/02/25 10:44, Paul E. McKenney wrote:
> The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
> ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
> This can (and does) result in extraneous lockdep warnings when this
> WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
> of helpful.
>
> Therefore, invert the WARN_ON_ONCE() condition and move it before the
> call to ct_state_inc().  This does mean that the ct_state_inc() return
> value can no longer be used in the WARN_ON_ONCE() condition, so discard
> this return value and instead use a call to rcu_is_watching_curr_cpu().
> This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
> is no added overhead in production use.
>
> Reported-by: Breno Leitao <leitao@debian.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Valentin Schneider <vschneid@redhat.com>
>
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 938c48952d26..fb5be6e9b423 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
>   */
>  static noinstr void ct_kernel_exit_state(int offset)
>  {
> -	int seq;
> -
>       /*
>        * CPUs seeing atomic_add_return() must see prior RCU read-side
>        * critical sections, and we also must force ordering with the
>        * next idle sojourn.
>        */
>       rcu_task_trace_heavyweight_enter();  // Before CT state update!
> -	seq = ct_state_inc(offset);
> -	// RCU is no longer watching.  Better be in extended quiescent state!
> -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> +	// RCU is still watching.  Better not be in extended quiescent state!
> +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());

Isn't this equivalent to the check done in ct_kernel_enter_state()? That
is, it operates on the same context_tracking.state value that the
ct_kernel_enter_state() WARN_ON_ONCE() sees, so if the warning is to fire
it will fire there first.

I don't have any better idea than something like the ugly:

	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG)) {
		unsigned int new_state, state = atomic_read(&ct->state);
		bool ret;

		do {
			new_state = state + offset;
			// RCU will no longer be watching. Better be in extended quiescent state!
			WARN_ON_ONCE(new_state & CT_RCU_WATCHING);

			ret = atomic_try_cmpxchg(&ct->state, &state, new_state);
		} while (!ret);
	} else {
		(void)ct_state_inc(offset);
	}

> +	(void)ct_state_inc(offset);
> +	// RCU is no longer watching.
>  }
>
>  /*
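
The compare-and-exchange loop suggested above can be tried out in userspace with C11 atomics. This is only a sketch under stated assumptions: MODEL_RCU_WATCHING and model_exit_checked() are hypothetical names, atomic_compare_exchange_weak() stands in for atomic_try_cmpxchg(), and a plain counter stands in for WARN_ON_ONCE(). The point it illustrates is that the would-be new state is inspected before it is published.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for CT_RCU_WATCHING; the value is arbitrary. */
#define MODEL_RCU_WATCHING 0x4u

/*
 * Add offset to *state, checking the value about to be stored before
 * it becomes visible.  *warnings is bumped at most once per call,
 * even if the loop retries.  Returns the value that was stored.
 */
static unsigned int model_exit_checked(atomic_uint *state,
				       unsigned int offset,
				       unsigned int *warnings)
{
	unsigned int old = atomic_load(state);
	unsigned int new_state;
	bool warned = false;

	do {
		new_state = old + offset;
		/* The new state should have the watching bit clear. */
		if ((new_state & MODEL_RCU_WATCHING) && !warned) {
			(*warnings)++;
			warned = true;
		}
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(state, &old, new_state));

	return new_state;
}
```

Starting from a state with the watching bit set, the call should store a state with the bit clear and leave the counter alone. Starting from a state with the bit already clear, the counter should go up by one.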
Re: [PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning
Posted by Paul E. McKenney 1 year ago
On Wed, Feb 05, 2025 at 12:17:06PM +0100, Valentin Schneider wrote:
> On 01/02/25 10:44, Paul E. McKenney wrote:
> > The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
> > ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
> > This can (and does) result in extraneous lockdep warnings when this
> > WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
> > of helpful.
> >
> > Therefore, invert the WARN_ON_ONCE() condition and move it before the
> > call to ct_state_inc().  This does mean that the ct_state_inc() return
> > value can no longer be used in the WARN_ON_ONCE() condition, so discard
> > this return value and instead use a call to rcu_is_watching_curr_cpu().
> > This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
> > is no added overhead in production use.
> >
> > Reported-by: Breno Leitao <leitao@debian.org>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Valentin Schneider <vschneid@redhat.com>
> >
> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> > index 938c48952d26..fb5be6e9b423 100644
> > --- a/kernel/context_tracking.c
> > +++ b/kernel/context_tracking.c
> > @@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
> >   */
> >  static noinstr void ct_kernel_exit_state(int offset)
> >  {
> > -	int seq;
> > -
> >       /*
> >        * CPUs seeing atomic_add_return() must see prior RCU read-side
> >        * critical sections, and we also must force ordering with the
> >        * next idle sojourn.
> >        */
> >       rcu_task_trace_heavyweight_enter();  // Before CT state update!
> > -	seq = ct_state_inc(offset);
> > -	// RCU is no longer watching.  Better be in extended quiescent state!
> > -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> > +	// RCU is still watching.  Better not be in extended quiescent state!
> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
> 
> Isn't this equivalent to the check done in ct_kernel_enter_state()? That
> is, it operates on the same context_tracking.state value that the
> ct_kernel_enter_state() WARN_ON_ONCE() sees, so if the warning is to fire
> it will fire there first.

In theory, yes.  In practice, the bug we are trying to complain about
might well be due to that call to ct_kernel_enter_state() having been
left out completely.  Or, more likely, the call to one of its callers
having been left out completely.  So we cannot rely on its WARN_ON_ONCE()
to detect this sort of omitted-call bug.

And these omitted-call bugs do happen when bringing up new hardware or
implementing new exception paths for existing hardware.
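
To make the omitted-call case concrete, here is a small userspace model. Everything in it is hypothetical: struct model_cpu, model_enter(), model_exit(), and MODEL_RCU_WATCHING are illustrative names, the enter-side check is assumed to test the state after its update, and counters stand in for WARN_ON_ONCE().

```c
#include <assert.h>

/* Hypothetical stand-in for CT_RCU_WATCHING; the value is arbitrary. */
#define MODEL_RCU_WATCHING 0x4u

struct model_cpu {
	unsigned int state;          /* models context_tracking.state */
	unsigned int enter_warnings; /* hits of the enter-side check */
	unsigned int exit_warnings;  /* hits of the exit-side check */
};

/* Leave the extended quiescent state: update, then check (assumed). */
static void model_enter(struct model_cpu *cpu)
{
	cpu->state += MODEL_RCU_WATCHING;
	if (!(cpu->state & MODEL_RCU_WATCHING))
		cpu->enter_warnings++;
}

/* Go into the extended quiescent state: check first, then update. */
static void model_exit(struct model_cpu *cpu)
{
	if (!(cpu->state & MODEL_RCU_WATCHING))
		cpu->exit_warnings++;
	cpu->state += MODEL_RCU_WATCHING;
}
```

With a properly paired model_enter() and model_exit(), neither counter moves. If a code path skips model_enter() and calls only model_exit() on a CPU that starts out not watching, the enter-side check never runs, so only the exit-side counter can record the problem.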

> I don't have any better idea than something like the ugly:
> 
> 	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG)) {
> 		unsigned int new_state, state = atomic_read(&ct->state);
> 		bool ret;
> 
> 		do {
> 			new_state = state + offset;
> 			// RCU will no longer be watching. Better be in extended quiescent state!
> 			WARN_ON_ONCE(new_state & CT_RCU_WATCHING);
> 
> 			ret = atomic_try_cmpxchg(&ct->state, &state, new_state);
> 		} while (!ret);
> 	} else {
> 		(void)ct_state_inc(offset);
> 	}

This would make sense if we need to detect a bug in ct_state_inc() itself.
But that function is a one-liner invoking raw_atomic_add_return(),
and we have other tests to find bugs in atomics, correct?

Or am I missing a trick here?

							Thanx, Paul

> > +	(void)ct_state_inc(offset);
> > +	// RCU is no longer watching.
> >  }
> >
> >  /*
>
Re: [PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning
Posted by Valentin Schneider 1 year ago
On 05/02/25 04:16, Paul E. McKenney wrote:
> On Wed, Feb 05, 2025 at 12:17:06PM +0100, Valentin Schneider wrote:
>> On 01/02/25 10:44, Paul E. McKenney wrote:
>> > The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
>> > ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
>> > This can (and does) result in extraneous lockdep warnings when this
>> > WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
>> > of helpful.
>> >
>> > Therefore, invert the WARN_ON_ONCE() condition and move it before the
>> > call to ct_state_inc().  This does mean that the ct_state_inc() return
>> > value can no longer be used in the WARN_ON_ONCE() condition, so discard
>> > this return value and instead use a call to rcu_is_watching_curr_cpu().
>> > This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
>> > is no added overhead in production use.
>> >
>> > Reported-by: Breno Leitao <leitao@debian.org>
>> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>> > Cc: Frederic Weisbecker <frederic@kernel.org>
>> > Cc: Valentin Schneider <vschneid@redhat.com>
>> >
>> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
>> > index 938c48952d26..fb5be6e9b423 100644
>> > --- a/kernel/context_tracking.c
>> > +++ b/kernel/context_tracking.c
>> > @@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
>> >   */
>> >  static noinstr void ct_kernel_exit_state(int offset)
>> >  {
>> > -	int seq;
>> > -
>> >       /*
>> >        * CPUs seeing atomic_add_return() must see prior RCU read-side
>> >        * critical sections, and we also must force ordering with the
>> >        * next idle sojourn.
>> >        */
>> >       rcu_task_trace_heavyweight_enter();  // Before CT state update!
>> > -	seq = ct_state_inc(offset);
>> > -	// RCU is no longer watching.  Better be in extended quiescent state!
>> > -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
>> > +	// RCU is still watching.  Better not be in extended quiescent state!
>> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
>>
>> Isn't this equivalent to the check done in ct_kernel_enter_state()? That
>> is, it operates on the same context_tracking.state value that the
>> ct_kernel_enter_state() WARN_ON_ONCE() sees, so if the warning is to fire
>> it will fire there first.
>
> In theory, yes.  In practice, the bug we are trying to complain about
> might well be due to that call to ct_kernel_enter_state() having been
> left out completely.  Or, more likely, the call to one of its callers
> having been left out completely.  So we cannot rely on its WARN_ON_ONCE()
> to detect this sort of omitted-call bug.
>
> And these omitted-call bugs do happen when bringing up new hardware or
> implementing new exception paths for existing hardware.
>

Ah, quite so, it even says so on the tin for ct_nmi_enter() & co.

>> I don't have any better idea than something like the ugly:
>>
>>      if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG)) {
>>              unsigned int new_state, state = atomic_read(&ct->state);
>>              bool ret;
>>
>>              do {
>>                      new_state = state + offset;
>>                      // RCU will no longer be watching. Better be in extended quiescent state!
>>                      WARN_ON_ONCE(new_state & CT_RCU_WATCHING);
>>
>>                      ret = atomic_try_cmpxchg(&ct->state, &state, new_state);
>>              } while (!ret);
>>      } else {
>>              (void)ct_state_inc(offset);
>>      }
>
> This would make sense if we need to detect a bug in ct_state_inc() itself.
> But that function is a one-liner invoking raw_atomic_add_return(),
> and we have other tests to find bugs in atomics, correct?
>
> Or am I missing a trick here?
>

Not at all; consider my suggestion revoked and my questioning answered :-)

Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Re: [PATCH RFC context_tracking] Make RCU watch ct_kernel_exit_state() warning
Posted by Paul E. McKenney 1 year ago
On Wed, Feb 05, 2025 at 03:45:55PM +0100, Valentin Schneider wrote:
> On 05/02/25 04:16, Paul E. McKenney wrote:
> > On Wed, Feb 05, 2025 at 12:17:06PM +0100, Valentin Schneider wrote:
> >> On 01/02/25 10:44, Paul E. McKenney wrote:
> >> > The WARN_ON_ONCE() in ct_kernel_exit_state() follows the call to
> >> > ct_state_inc(), which means that RCU is not watching this WARN_ON_ONCE().
> >> > This can (and does) result in extraneous lockdep warnings when this
> >> > WARN_ON_ONCE() triggers.  These extraneous warnings are the opposite
> >> > of helpful.
> >> >
> >> > Therefore, invert the WARN_ON_ONCE() condition and move it before the
> >> > call to ct_state_inc().  This does mean that the ct_state_inc() return
> >> > value can no longer be used in the WARN_ON_ONCE() condition, so discard
> >> > this return value and instead use a call to rcu_is_watching_curr_cpu().
> >> > This call is executed only in CONFIG_RCU_EQS_DEBUG=y kernels, so there
> >> > is no added overhead in production use.
> >> >
> >> > Reported-by: Breno Leitao <leitao@debian.org>
> >> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >> > Cc: Frederic Weisbecker <frederic@kernel.org>
> >> > Cc: Valentin Schneider <vschneid@redhat.com>
> >> >
> >> > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> >> > index 938c48952d26..fb5be6e9b423 100644
> >> > --- a/kernel/context_tracking.c
> >> > +++ b/kernel/context_tracking.c
> >> > @@ -80,17 +80,16 @@ static __always_inline void rcu_task_trace_heavyweight_exit(void)
> >> >   */
> >> >  static noinstr void ct_kernel_exit_state(int offset)
> >> >  {
> >> > -	int seq;
> >> > -
> >> >       /*
> >> >        * CPUs seeing atomic_add_return() must see prior RCU read-side
> >> >        * critical sections, and we also must force ordering with the
> >> >        * next idle sojourn.
> >> >        */
> >> >       rcu_task_trace_heavyweight_enter();  // Before CT state update!
> >> > -	seq = ct_state_inc(offset);
> >> > -	// RCU is no longer watching.  Better be in extended quiescent state!
> >> > -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
> >> > +	// RCU is still watching.  Better not be in extended quiescent state!
> >> > +	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !rcu_is_watching_curr_cpu());
> >>
> >> Isn't this equivalent to the check done in ct_kernel_enter_state()? That
> >> is, it operates on the same context_tracking.state value that the
> >> ct_kernel_enter_state() WARN_ON_ONCE() sees, so if the warning is to fire
> >> it will fire there first.
> >
> > In theory, yes.  In practice, the bug we are trying to complain about
> > might well be due to that call to ct_kernel_enter_state() having been
> > left out completely.  Or, more likely, the call to one of its callers
> > having been left out completely.  So we cannot rely on its WARN_ON_ONCE()
> > to detect this sort of omitted-call bug.
> >
> > And these omitted-call bugs do happen when bringing up new hardware or
> > implementing new exception paths for existing hardware.
> 
> Ah, quite so, it even says so on the tin for ct_nmi_enter() & co.
> 
> >> I don't have any better idea than something like the ugly:
> >>
> >>      if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG)) {
> >>              unsigned int new_state, state = atomic_read(&ct->state);
> >>              bool ret;
> >>
> >>              do {
> >>                      new_state = state + offset;
> >>                      // RCU will no longer be watching. Better be in extended quiescent state!
> >>                      WARN_ON_ONCE(new_state & CT_RCU_WATCHING);
> >>
> >>                      ret = atomic_try_cmpxchg(&ct->state, &state, new_state);
> >>              } while (!ret);
> >>      } else {
> >>              (void)ct_state_inc(offset);
> >>      }
> >
> > This would make sense if we need to detect a bug in ct_state_inc() itself.
> > But that function is a one-liner invoking raw_atomic_add_return(),
> > and we have other tests to find bugs in atomics, correct?
> >
> > Or am I missing a trick here?
> 
> Not at all; consider my suggestion revoked and my questioning answered :-)
> 
> Reviewed-by: Valentin Schneider <vschneid@redhat.com>

Thank you!  I will add this during my next rebase.

							Thanx, Paul