With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
states for read-side critical sections via rcu_all_qs().
One reason this was needed: lacking a preempt count, the tick
handler has no way of knowing whether it is executing in a read-side
critical section or not.
With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
quiescent states via rcu_all_qs().
So, use the availability of preempt_count() to report quiescent states
in rcu_flavor_sched_clock_irq().
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/rcu/tree_plugin.h | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 1c7cbd145d5e..da324d66034b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
  */
 static void rcu_flavor_sched_clock_irq(int user)
 {
-	if (user || rcu_is_cpu_rrupt_from_idle()) {
+	if (user || rcu_is_cpu_rrupt_from_idle() ||
+	    (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
+	     !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
 
 		/*
 		 * Get here if this CPU took its interrupt from user
-		 * mode or from the idle loop, and if this is not a
-		 * nested interrupt.  In this case, the CPU is in
-		 * a quiescent state, so note it.
+		 * mode, from the idle loop without this being a nested
+		 * interrupt, or while not holding a preempt count (but
+		 * with PREEMPT_COUNT=y).  In this case, the CPU is in a
+		 * quiescent state, so note it.
 		 *
 		 * No memory barrier is required here because rcu_qs()
 		 * references only CPU-local variables that other CPUs
--
2.43.5
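With PREEMPT_RCU=n and PREEMPT_COUNT=y, rcu_read_lock() maps to
preempt_disable(), so an in-progress read-side critical section shows up
in the PREEMPT_MASK bits of preempt_count(); softirq processing, which
also blocks the grace period, shows up in SOFTIRQ_MASK. A minimal
standalone sketch of the new predicate, using the bit layout from
include/linux/preempt.h (the sample counts are hypothetical values a
tick handler might observe, not kernel state):

	#include <stdio.h>

	/* preempt_count() bit layout, as in include/linux/preempt.h:
	 * bits 0-7 preempt, 8-15 softirq, 16-19 hardirq, 20-23 NMI. */
	#define PREEMPT_MASK	0x000000ffU
	#define SOFTIRQ_MASK	0x0000ff00U

	#define PREEMPT_OFFSET	0x1U
	#define SOFTIRQ_OFFSET	0x100U
	#define HARDIRQ_OFFSET	0x10000U

	/* The patch's test: quiescent unless the interrupted context held
	 * a preempt_disable() (== rcu_read_lock() here) or was in softirq. */
	static int in_quiescent_state(unsigned int preempt_count)
	{
		return !(preempt_count & (PREEMPT_MASK | SOFTIRQ_MASK));
	}

	int main(void)
	{
		/* Hypothetical counts as seen from the tick (hardirq) handler. */
		struct { const char *what; unsigned int pc; } samples[] = {
			{ "task code, no locks held",	HARDIRQ_OFFSET },
			{ "inside rcu_read_lock()",	HARDIRQ_OFFSET + PREEMPT_OFFSET },
			{ "interrupted a softirq",	HARDIRQ_OFFSET + SOFTIRQ_OFFSET },
		};

		for (int i = 0; i < 3; i++)
			printf("%-28s -> %s\n", samples[i].what,
			       in_quiescent_state(samples[i].pc) ?
			       "quiescent" : "not quiescent");
		return 0;
	}

The hardirq bits are deliberately left out of the mask: the tick handler
itself always runs in hardirq context, so those bits carry no
information about what the interrupted task was doing.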
On 2024-10-09 09:54:08 [-0700], Ankur Arora wrote:
> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> states for read-side critical sections via rcu_all_qs().
> One reason this was needed: lacking a preempt count, the tick
> handler has no way of knowing whether it is executing in a read-side
> critical section or not.
>
> With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
> PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
> quiescent states via rcu_all_qs().

PREEMPT_LAZY selects PREEMPT_BUILD which selects PREEMPTION which in
turn selects PREEMPT_RCU. So this is not a valid combination. Do you
have a different tree than I do? Because maybe I am missing something.

> So, use the availability of preempt_count() to report quiescent states
> in rcu_flavor_sched_clock_irq().
>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  kernel/rcu/tree_plugin.h | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 1c7cbd145d5e..da324d66034b 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>   */
>  static void rcu_flavor_sched_clock_irq(int user)
>  {
> -	if (user || rcu_is_cpu_rrupt_from_idle()) {
> +	if (user || rcu_is_cpu_rrupt_from_idle() ||
> +	    (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
> +	     !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {

couldn't you use a helper preemptible()?

>  		/*
>  		 * Get here if this CPU took its interrupt from user
> -		 * mode or from the idle loop, and if this is not a
> -		 * nested interrupt.  In this case, the CPU is in
> -		 * a quiescent state, so note it.
> +		 * mode, from the idle loop without this being a nested
> +		 * interrupt, or while not holding a preempt count (but
> +		 * with PREEMPT_COUNT=y).  In this case, the CPU is in a
> +		 * quiescent state, so note it.
>  		 *
>  		 * No memory barrier is required here because rcu_qs()
>  		 * references only CPU-local variables that other CPUs

Sebastian
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2024-10-09 09:54:08 [-0700], Ankur Arora wrote:
>> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
>> states for read-side critical sections via rcu_all_qs().
>> One reason this was needed: lacking a preempt count, the tick
>> handler has no way of knowing whether it is executing in a read-side
>> critical section or not.
>>
>> With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
>> PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
>> quiescent states via rcu_all_qs().
>
> PREEMPT_LAZY selects PREEMPT_BUILD which selects PREEMPTION which in
> turn selects PREEMPT_RCU. So this is not a valid combination. Do you
> have a different tree than I do? Because maybe I am missing something.

The second patch in the series?

>> So, use the availability of preempt_count() to report quiescent states
>> in rcu_flavor_sched_clock_irq().
>>
>> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>  kernel/rcu/tree_plugin.h | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>> index 1c7cbd145d5e..da324d66034b 100644
>> --- a/kernel/rcu/tree_plugin.h
>> +++ b/kernel/rcu/tree_plugin.h
>> @@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>>   */
>>  static void rcu_flavor_sched_clock_irq(int user)
>>  {
>> -	if (user || rcu_is_cpu_rrupt_from_idle()) {
>> +	if (user || rcu_is_cpu_rrupt_from_idle() ||
>> +	    (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
>> +	     !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
>
> couldn't you use a helper preemptible()?

Alas no. This check isn't trying to establish preemptibility (this is
called in irq context, so we already know that we aren't preemptible).
The check uses the preempt count to infer whether this CPU is in an
RCU read-side critical section.

>>  		/*
>>  		 * Get here if this CPU took its interrupt from user
>> -		 * mode or from the idle loop, and if this is not a
>> -		 * nested interrupt.  In this case, the CPU is in
>> -		 * a quiescent state, so note it.
>> +		 * mode, from the idle loop without this being a nested
>> +		 * interrupt, or while not holding a preempt count (but
>> +		 * with PREEMPT_COUNT=y).  In this case, the CPU is in a
>> +		 * quiescent state, so note it.
>>  		 *
>>  		 * No memory barrier is required here because rcu_qs()
>>  		 * references only CPU-local variables that other CPUs
>
> Sebastian

--
ankur
On 2024-10-10 10:56:36 [-0700], Ankur Arora wrote:
> >
> > PREEMPT_LAZY selects PREEMPT_BUILD which selects PREEMPTION which in
> > turn selects PREEMPT_RCU. So this is not a valid combination. Do you
> > have a different tree than I do? Because maybe I am missing something.
>
> The second patch in the series?

As long as "PREEMPT_RCU" has no text behind its bool you can't select
it independently.

> >> --- a/kernel/rcu/tree_plugin.h
> >> +++ b/kernel/rcu/tree_plugin.h
> >> @@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
> >>   */
> >>  static void rcu_flavor_sched_clock_irq(int user)
> >>  {
> >> -	if (user || rcu_is_cpu_rrupt_from_idle()) {
> >> +	if (user || rcu_is_cpu_rrupt_from_idle() ||
> >> +	    (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
> >> +	     !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
> >
> > couldn't you use a helper preemptible()?
>
> Alas no. This check isn't trying to establish preemptibility (this is
> called in irq context, so we already know that we aren't preemptible).
> The check uses the preempt count to infer whether this CPU is in an
> RCU read-side critical section.

I see.

Sebastian
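To make the exchange above concrete: preemptible() tests the whole
preempt count (and, in the kernel, the IRQ-disable state), so from
hardirq context it is always false and cannot distinguish "interrupted a
read-side critical section" from "interrupted plain task code". A
hedged userspace model follows; model_preemptible() only approximates
the kernel macro and omits the irqs_disabled() term:

	#include <stdio.h>

	#define PREEMPT_MASK	0x000000ffU
	#define SOFTIRQ_MASK	0x0000ff00U
	#define HARDIRQ_OFFSET	0x10000U

	/* Rough model of the kernel's preemptible(): the full count must
	 * be zero. The hardirq bits alone force this to false inside any
	 * interrupt handler, including the tick. */
	static int model_preemptible(unsigned int pc)
	{
		return pc == 0;
	}

	/* The patch's check masks the hardirq/NMI bits out on purpose. */
	static int rcu_qs_check(unsigned int pc)
	{
		return !(pc & (PREEMPT_MASK | SOFTIRQ_MASK));
	}

	int main(void)
	{
		/* Tick taken while the task held no locks: a quiescent state. */
		unsigned int pc = HARDIRQ_OFFSET;

		printf("model_preemptible() -> %d\n", model_preemptible(pc)); /* 0 */
		printf("rcu_qs_check()      -> %d\n", rcu_qs_check(pc));      /* 1 */
		return 0;
	}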
Ankur Arora <ankur.a.arora@oracle.com> writes:

> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> states for read-side critical sections via rcu_all_qs().
> One reason this was needed: lacking a preempt count, the tick
> handler has no way of knowing whether it is executing in a read-side
> critical section or not.
>
> With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
> PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
> quiescent states via rcu_all_qs().
>
> So, use the availability of preempt_count() to report quiescent states
> in rcu_flavor_sched_clock_irq().

A note about the inverse of this case, where we might have long-running
loops which only temporarily enable preemption and thus would be
unlikely to align themselves with the tick: in prior discussions [1]
Paul had pointed out the need to provide a way to force a context
switch in such a scenario.

I had a patch which did that, but I think it is unnecessary since this
clause in rcu_sched_clock_irq() should already handle it.

	void rcu_sched_clock_irq(int user) {
		...
		/* The load-acquire pairs with the store-release setting to true. */
		if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
			/* Idle and userspace execution already are quiescent states. */
			if (!rcu_is_cpu_rrupt_from_idle() && !user) {
				set_tsk_need_resched(current);
				set_preempt_need_resched();
			}
			__this_cpu_write(rcu_data.rcu_urgent_qs, false);
		}

Paul?

--
ankur
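The "load-acquire pairs with the store-release" comment quoted above is
the standard flag-publication idiom: whoever sets rcu_urgent_qs with a
store-release guarantees that everything it wrote beforehand is visible
to a reader that observes the flag via a load-acquire. A minimal
userspace model in C11 atomics (the names are invented for
illustration; this is not the kernel code):

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	/* Stand-ins for rcu_data.rcu_urgent_qs and the state it publishes. */
	static atomic_bool urgent_qs;
	static int gp_state;	/* ordinary (non-atomic) data */

	static void *grace_period_side(void *arg)
	{
		(void)arg;
		gp_state = 42;	/* written before the flag... */
		/* ...so the store-release publishes it along with the flag. */
		atomic_store_explicit(&urgent_qs, true, memory_order_release);
		return NULL;
	}

	int main(void)
	{
		pthread_t t;

		pthread_create(&t, NULL, grace_period_side, NULL);

		/* Tick side: the load-acquire pairs with the store-release. */
		while (!atomic_load_explicit(&urgent_qs, memory_order_acquire))
			;	/* spin until the flag is observed */

		/* Ordering guarantee: gp_state reads 42 here, never 0. */
		printf("gp_state = %d\n", gp_state);

		pthread_join(t, NULL);
		return 0;
	}

Compile with -pthread; the point is only the acquire/release pairing,
not any other detail of the kernel's tick path.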
On Wed, Oct 09, 2024 at 12:05:47PM -0700, Ankur Arora wrote:
>
>
> Ankur Arora <ankur.a.arora@oracle.com> writes:
>
> > With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> > states for read-side critical sections via rcu_all_qs().
> > One reason this was needed: lacking a preempt count, the tick
> > handler has no way of knowing whether it is executing in a read-side
> > critical section or not.
> >
> > With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
> > PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
> > quiescent states via rcu_all_qs().
> >
> > So, use the availability of preempt_count() to report quiescent states
> > in rcu_flavor_sched_clock_irq().
>
> A note about the inverse of this case, where we might have long-running
> loops which only temporarily enable preemption and thus would be
> unlikely to align themselves with the tick: in prior discussions [1]
> Paul had pointed out the need to provide a way to force a context
> switch in such a scenario.
>
> I had a patch which did that, but I think it is unnecessary since this
> clause in rcu_sched_clock_irq() should already handle it.
>
> 	void rcu_sched_clock_irq(int user) {
> 		...
> 		/* The load-acquire pairs with the store-release setting to true. */
> 		if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
> 			/* Idle and userspace execution already are quiescent states. */
> 			if (!rcu_is_cpu_rrupt_from_idle() && !user) {
> 				set_tsk_need_resched(current);
> 				set_preempt_need_resched();
> 			}
> 			__this_cpu_write(rcu_data.rcu_urgent_qs, false);
> 		}
>
> Paul?

As long as the tick is actually enabled.

But looking deeper, there is code in force_qs_rnp() and
rcu_watching_snap_recheck() to force the tick on CPUs that don't
respond to the grace period soon enough via the -1 return from the
rcu_watching_snap_recheck() function and via resched_cpu(). So we
might be covered. Some serious testing is warranted.

							Thanx, Paul
Paul E. McKenney <paulmck@kernel.org> writes:

> On Wed, Oct 09, 2024 at 12:05:47PM -0700, Ankur Arora wrote:
>>
>>
>> Ankur Arora <ankur.a.arora@oracle.com> writes:
>>
>> > With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
>> > states for read-side critical sections via rcu_all_qs().
>> > One reason this was needed: lacking a preempt count, the tick
>> > handler has no way of knowing whether it is executing in a read-side
>> > critical section or not.
>> >
>> > With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
>> > PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
>> > quiescent states via rcu_all_qs().
>> >
>> > So, use the availability of preempt_count() to report quiescent states
>> > in rcu_flavor_sched_clock_irq().
>>
>> A note about the inverse of this case, where we might have long-running
>> loops which only temporarily enable preemption and thus would be
>> unlikely to align themselves with the tick: in prior discussions [1]
>> Paul had pointed out the need to provide a way to force a context
>> switch in such a scenario.
>>
>> I had a patch which did that, but I think it is unnecessary since this
>> clause in rcu_sched_clock_irq() should already handle it.
>>
>> 	void rcu_sched_clock_irq(int user) {
>> 		...
>> 		/* The load-acquire pairs with the store-release setting to true. */
>> 		if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
>> 			/* Idle and userspace execution already are quiescent states. */
>> 			if (!rcu_is_cpu_rrupt_from_idle() && !user) {
>> 				set_tsk_need_resched(current);
>> 				set_preempt_need_resched();
>> 			}
>> 			__this_cpu_write(rcu_data.rcu_urgent_qs, false);
>> 		}
>>
>> Paul?
>
> As long as the tick is actually enabled.
>
> But looking deeper, there is code in force_qs_rnp() and
> rcu_watching_snap_recheck() to force the tick on CPUs that don't
> respond to the grace period soon enough via the -1 return from the
> rcu_watching_snap_recheck() function and via resched_cpu().

Ah. I had missed that path. Thanks.

--
ankur