To reduce RCU noise on nohz_full configurations, osnoise depends on
cond_resched() to provide quiescent states when PREEMPT_RCU=n. When
PREEMPT_RCU=y, it instead calls rcu_momentary_eqs() directly.

With PREEMPT_LAZY=y, however, we can have configurations with
(PREEMPTION=y, PREEMPT_RCU=n), where neither of the above helps.

Handle that by falling back to explicit quiescent states via
rcu_momentary_eqs().

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index a50ed23bee77..15e9600d231d 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1538,18 +1538,20 @@ static int run_osnoise(void)
/*
* In some cases, notably when running on a nohz_full CPU with
* a stopped tick PREEMPT_RCU has no way to account for QSs.
- * This will eventually cause unwarranted noise as PREEMPT_RCU
- * will force preemption as the means of ending the current
- * grace period. We avoid this problem by calling
- * rcu_momentary_eqs(), which performs a zero duration
- * EQS allowing PREEMPT_RCU to end the current grace period.
- * This call shouldn't be wrapped inside an RCU critical
- * section.
+ * This will eventually cause unwarranted noise as RCU forces
+ * preemption as the means of ending the current grace period.
+ * We avoid this by calling rcu_momentary_eqs(), which performs
+ * a zero duration EQS allowing RCU to end the current grace
+ * period. This call shouldn't be wrapped inside an RCU
+ * critical section.
*
- * Note that in non PREEMPT_RCU kernels QSs are handled through
- * cond_resched()
+ * For non-PREEMPT_RCU kernels with cond_resched() (non-
+ * PREEMPT_LAZY configurations), QSs are handled through
+ * cond_resched(). For PREEMPT_LAZY kernels, we fallback to
+ * the zero duration QS via rcu_momentary_eqs().
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
+ if (IS_ENABLED(CONFIG_PREEMPT_RCU) ||
+ (!IS_ENABLED(CONFIG_PREEMPT_RCU) && IS_ENABLED(CONFIG_PREEMPTION))) {
if (!disable_irq)
local_irq_disable();
--
2.43.5
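
As a rough illustration of the gap this patch closes, the following userspace C sketch models the configurations named in the commit message. The struct and function names are hypothetical stand-ins, not kernel code, and the rule that cond_resched() only helps when PREEMPTION=n is a simplifying assumption that ignores PREEMPT_DYNAMIC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Kconfig symbols; not kernel code. */
struct cfg {
	bool preemption;   /* models CONFIG_PREEMPTION */
	bool preempt_rcu;  /* models CONFIG_PREEMPT_RCU */
};

/* Condition before the patch: only PREEMPT_RCU=y takes the explicit path. */
static bool explicit_eqs_before(struct cfg c)
{
	return c.preempt_rcu;
}

/* Condition as written in the patch. */
static bool explicit_eqs_after(struct cfg c)
{
	return c.preempt_rcu || (!c.preempt_rcu && c.preemption);
}

/* Assumption: cond_resched() supplies quiescent states only if PREEMPTION=n. */
static bool cond_resched_helps(struct cfg c)
{
	return !c.preemption;
}

/* A configuration is covered if either mechanism supplies quiescent states. */
static bool covered_before(struct cfg c)
{
	return explicit_eqs_before(c) || cond_resched_helps(c);
}

static bool covered_after(struct cfg c)
{
	return explicit_eqs_after(c) || cond_resched_helps(c);
}

/* Self-check: (PREEMPTION=y, PREEMPT_RCU=n) is missed before, covered after. */
static void check_coverage(void)
{
	struct cfg lazy = { .preemption = true, .preempt_rcu = false };

	assert(!covered_before(lazy));
	assert(covered_after(lazy));
}
```

Calling check_coverage() from a main() exercises the one combination that the old condition leaves without a quiescent state source in this model.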
On Wed, Nov 06, 2024 at 12:17:57PM -0800, Ankur Arora wrote:
> To reduce RCU noise for nohz_full configurations, osnoise depends
> on cond_resched() providing quiescent states for PREEMPT_RCU=n
> configurations. And, for PREEMPT_RCU=y configurations does this
> by directly calling rcu_momentary_eqs().
>
> With PREEMPT_LAZY=y, however, we can have configurations with
> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
> can help.
>
> Handle that by fallback to the explicit quiescent states via
> rcu_momentary_eqs().
>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> index a50ed23bee77..15e9600d231d 100644
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -1538,18 +1538,20 @@ static int run_osnoise(void)
> /*
> * In some cases, notably when running on a nohz_full CPU with
> * a stopped tick PREEMPT_RCU has no way to account for QSs.
> - * This will eventually cause unwarranted noise as PREEMPT_RCU
> - * will force preemption as the means of ending the current
> - * grace period. We avoid this problem by calling
> - * rcu_momentary_eqs(), which performs a zero duration
> - * EQS allowing PREEMPT_RCU to end the current grace period.
> - * This call shouldn't be wrapped inside an RCU critical
> - * section.
> + * This will eventually cause unwarranted noise as RCU forces
> + * preemption as the means of ending the current grace period.
> + * We avoid this by calling rcu_momentary_eqs(), which performs
> + * a zero duration EQS allowing RCU to end the current grace
> + * period. This call shouldn't be wrapped inside an RCU
> + * critical section.
> *
> - * Note that in non PREEMPT_RCU kernels QSs are handled through
> - * cond_resched()
> + * For non-PREEMPT_RCU kernels with cond_resched() (non-
> + * PREEMPT_LAZY configurations), QSs are handled through
> + * cond_resched(). For PREEMPT_LAZY kernels, we fallback to
> + * the zero duration QS via rcu_momentary_eqs().
> */
> - if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
> + if (IS_ENABLED(CONFIG_PREEMPT_RCU) ||
> + (!IS_ENABLED(CONFIG_PREEMPT_RCU) && IS_ENABLED(CONFIG_PREEMPTION))) {
> if (!disable_irq)
> local_irq_disable();
How about making this unconditional so it works everywhere and doesn't
rely on cond_resched() Kconfig/preempt-dynamic mood?
Thanks.
>
> --
> 2.43.5
>
Frederic Weisbecker <frederic@kernel.org> writes:
> On Wed, Nov 06, 2024 at 12:17:57PM -0800, Ankur Arora wrote:
>> To reduce RCU noise for nohz_full configurations, osnoise depends
>> on cond_resched() providing quiescent states for PREEMPT_RCU=n
>> configurations. And, for PREEMPT_RCU=y configurations does this
>> by directly calling rcu_momentary_eqs().
>>
>> With PREEMPT_LAZY=y, however, we can have configurations with
>> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
>> can help.
>>
>> Handle that by fallback to the explicit quiescent states via
>> rcu_momentary_eqs().
>>
>> Cc: Paul E. McKenney <paulmck@kernel.org>
>> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
>> 1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
>> index a50ed23bee77..15e9600d231d 100644
>> --- a/kernel/trace/trace_osnoise.c
>> +++ b/kernel/trace/trace_osnoise.c
>> @@ -1538,18 +1538,20 @@ static int run_osnoise(void)
>> /*
>> * In some cases, notably when running on a nohz_full CPU with
>> * a stopped tick PREEMPT_RCU has no way to account for QSs.
>> - * This will eventually cause unwarranted noise as PREEMPT_RCU
>> - * will force preemption as the means of ending the current
>> - * grace period. We avoid this problem by calling
>> - * rcu_momentary_eqs(), which performs a zero duration
>> - * EQS allowing PREEMPT_RCU to end the current grace period.
>> - * This call shouldn't be wrapped inside an RCU critical
>> - * section.
>> + * This will eventually cause unwarranted noise as RCU forces
>> + * preemption as the means of ending the current grace period.
>> + * We avoid this by calling rcu_momentary_eqs(), which performs
>> + * a zero duration EQS allowing RCU to end the current grace
>> + * period. This call shouldn't be wrapped inside an RCU
>> + * critical section.
>> *
>> - * Note that in non PREEMPT_RCU kernels QSs are handled through
>> - * cond_resched()
>> + * For non-PREEMPT_RCU kernels with cond_resched() (non-
>> + * PREEMPT_LAZY configurations), QSs are handled through
>> + * cond_resched(). For PREEMPT_LAZY kernels, we fallback to
>> + * the zero duration QS via rcu_momentary_eqs().
>> */
>> - if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
>> + if (IS_ENABLED(CONFIG_PREEMPT_RCU) ||
>> + (!IS_ENABLED(CONFIG_PREEMPT_RCU) && IS_ENABLED(CONFIG_PREEMPTION))) {
>> if (!disable_irq)
>> local_irq_disable();
>
> How about making this unconditional so it works everywhere and doesn't
> rely on cond_resched() Kconfig/preempt-dynamic mood?
I think it's a minor matter given that this isn't a hot path, but
we don't really need it for the !PREEMPT_RCU configuration.
Still, given that both of those clauses imply CONFIG_PREEMPTION, we
can just simplify this to (with an appropriately adjusted comment):
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1543,7 +1543,7 @@ static int run_osnoise(void)
* Note that in non PREEMPT_RCU kernels QSs are handled through
* cond_resched()
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
+ if (IS_ENABLED(CONFIG_PREEMPTION)) {
if (!disable_irq)
local_irq_disable();
--
ankur
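
The simplification proposed above can be checked mechanically. This is a minimal userspace sketch, assuming (as stated in the message) that PREEMPT_RCU=y implies PREEMPTION=y; the function names are hypothetical and plain booleans stand in for the IS_ENABLED() tests.

```c
#include <assert.h>
#include <stdbool.h>

/* The two-clause condition from the original patch. */
static bool patch_condition(bool preempt_rcu, bool preemption)
{
	return preempt_rcu || (!preempt_rcu && preemption);
}

/* The proposed replacement: test PREEMPTION alone. */
static bool simplified_condition(bool preemption)
{
	return preemption;
}

/*
 * Walk every combination allowed under the stated assumption and
 * assert that both conditions agree. Returns how many were checked.
 */
static int check_simplification(void)
{
	int checked = 0;

	for (int rcu = 0; rcu <= 1; rcu++) {
		for (int preempt = 0; preempt <= 1; preempt++) {
			/* Assumption: PREEMPT_RCU=y requires PREEMPTION=y. */
			if (rcu && !preempt)
				continue;
			assert(patch_condition(rcu, preempt) ==
			       simplified_condition(preempt));
			checked++;
		}
	}
	return checked;
}
```

Without the assumption the two conditions would differ for (PREEMPT_RCU=y, PREEMPTION=n), which is why that combination is skipped rather than tested.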
On Thu, Nov 28, 2024 at 09:03:56PM -0800, Ankur Arora wrote:
>
> Frederic Weisbecker <frederic@kernel.org> writes:
>
> > On Wed, Nov 06, 2024 at 12:17:57PM -0800, Ankur Arora wrote:
> >> To reduce RCU noise for nohz_full configurations, osnoise depends
> >> on cond_resched() providing quiescent states for PREEMPT_RCU=n
> >> configurations. And, for PREEMPT_RCU=y configurations does this
> >> by directly calling rcu_momentary_eqs().
> >>
> >> With PREEMPT_LAZY=y, however, we can have configurations with
> >> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
> >> can help.
> >>
> >> Handle that by fallback to the explicit quiescent states via
> >> rcu_momentary_eqs().
> >>
> >> Cc: Paul E. McKenney <paulmck@kernel.org>
> >> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
> >> Cc: Steven Rostedt <rostedt@goodmis.org>
> >> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> >> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
> >> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> >> ---
> >> kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
> >> 1 file changed, 12 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> >> index a50ed23bee77..15e9600d231d 100644
> >> --- a/kernel/trace/trace_osnoise.c
> >> +++ b/kernel/trace/trace_osnoise.c
> >> @@ -1538,18 +1538,20 @@ static int run_osnoise(void)
> >> /*
> >> * In some cases, notably when running on a nohz_full CPU with
> >> * a stopped tick PREEMPT_RCU has no way to account for QSs.
> >> - * This will eventually cause unwarranted noise as PREEMPT_RCU
> >> - * will force preemption as the means of ending the current
> >> - * grace period. We avoid this problem by calling
> >> - * rcu_momentary_eqs(), which performs a zero duration
> >> - * EQS allowing PREEMPT_RCU to end the current grace period.
> >> - * This call shouldn't be wrapped inside an RCU critical
> >> - * section.
> >> + * This will eventually cause unwarranted noise as RCU forces
> >> + * preemption as the means of ending the current grace period.
> >> + * We avoid this by calling rcu_momentary_eqs(), which performs
> >> + * a zero duration EQS allowing RCU to end the current grace
> >> + * period. This call shouldn't be wrapped inside an RCU
> >> + * critical section.
> >> *
> >> - * Note that in non PREEMPT_RCU kernels QSs are handled through
> >> - * cond_resched()
> >> + * For non-PREEMPT_RCU kernels with cond_resched() (non-
> >> + * PREEMPT_LAZY configurations), QSs are handled through
> >> + * cond_resched(). For PREEMPT_LAZY kernels, we fallback to
> >> + * the zero duration QS via rcu_momentary_eqs().
> >> */
> >> - if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
> >> + if (IS_ENABLED(CONFIG_PREEMPT_RCU) ||
> >> + (!IS_ENABLED(CONFIG_PREEMPT_RCU) && IS_ENABLED(CONFIG_PREEMPTION))) {
> >> if (!disable_irq)
> >> local_irq_disable();
> >
> > How about making this unconditional so it works everywhere and doesn't
> > rely on cond_resched() Kconfig/preempt-dynamic mood?
>
> I think it's a minor matter given that this isn't a hot path, but
> we don't really need it for the !PREEMPT_RCU configuration.
Well, if you make it unconditional, cond_resched() / rcu_all_qs() won't do its
own rcu_momentary_eqs(), because rcu_data.rcu_urgent_qs should
be false. So that essentially unifies the behaviour across all configurations.
Thanks.
>
> Still, given that both of those clauses imply CONFIG_PREEMPTION, we
> can just simplify this to (with an appropriately adjusted comment):
>
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -1543,7 +1543,7 @@ static int run_osnoise(void)
> * Note that in non PREEMPT_RCU kernels QSs are handled through
> * cond_resched()
> */
> - if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
> + if (IS_ENABLED(CONFIG_PREEMPTION)) {
> if (!disable_irq)
> local_irq_disable();
>
> --
> ankur
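
The point about rcu_data.rcu_urgent_qs can be pictured with a toy userspace model. Everything here is hypothetical and heavily simplified: the struct, the toy_* functions, and in particular the assumption that the grace-period scan raises the urgent flag only for a CPU that has shown no extended quiescent state since the previous scan. It is a sketch of the reasoning, not of the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; a toy model, not the kernel's rcu_data. */
struct toy_cpu {
	bool urgent_qs;     /* models rcu_data.rcu_urgent_qs */
	bool eqs_seen;      /* an EQS happened since the last scan */
	int qs_via_resched; /* times the cond_resched() path had to act */
};

/* Models the explicit call: records a zero-duration EQS. */
static void toy_momentary_eqs(struct toy_cpu *cpu)
{
	cpu->eqs_seen = true;
}

/* Models the grace-period scan (assumed behaviour, see lead-in). */
static void toy_gp_scan(struct toy_cpu *cpu)
{
	if (!cpu->eqs_seen)
		cpu->urgent_qs = true;
	cpu->eqs_seen = false;
}

/* Models cond_resched() / rcu_all_qs(): a no-op unless the flag is set. */
static void toy_cond_resched(struct toy_cpu *cpu)
{
	if (!cpu->urgent_qs)
		return;
	cpu->urgent_qs = false;
	cpu->qs_via_resched++;
}

/* Runs an osnoise-style loop; returns how often cond_resched() acted. */
static int toy_run(bool unconditional_eqs, int iterations)
{
	struct toy_cpu cpu = { 0 };

	assert(iterations >= 0);
	for (int i = 0; i < iterations; i++) {
		if (unconditional_eqs)
			toy_momentary_eqs(&cpu);
		toy_gp_scan(&cpu);
		toy_cond_resched(&cpu);
	}
	return cpu.qs_via_resched;
}
```

In this model, toy_run(true, n) never reaches the work inside toy_cond_resched(), so the explicit call is the only source of quiescent states regardless of how cond_resched() is configured.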
Frederic Weisbecker <frederic@kernel.org> writes:
> On Thu, Nov 28, 2024 at 09:03:56PM -0800, Ankur Arora wrote:
>>
>> Frederic Weisbecker <frederic@kernel.org> writes:
>>
>> > On Wed, Nov 06, 2024 at 12:17:57PM -0800, Ankur Arora wrote:
>> >> To reduce RCU noise for nohz_full configurations, osnoise depends
>> >> on cond_resched() providing quiescent states for PREEMPT_RCU=n
>> >> configurations. And, for PREEMPT_RCU=y configurations does this
>> >> by directly calling rcu_momentary_eqs().
>> >>
>> >> With PREEMPT_LAZY=y, however, we can have configurations with
>> >> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
>> >> can help.
>> >>
>> >> Handle that by fallback to the explicit quiescent states via
>> >> rcu_momentary_eqs().
>> >>
>> >> Cc: Paul E. McKenney <paulmck@kernel.org>
>> >> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
>> >> Cc: Steven Rostedt <rostedt@goodmis.org>
>> >> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>> >> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
>> >> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> >> ---
>> >> kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
>> >> 1 file changed, 12 insertions(+), 10 deletions(-)
>> >>
>> >> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
>> >> index a50ed23bee77..15e9600d231d 100644
>> >> --- a/kernel/trace/trace_osnoise.c
>> >> +++ b/kernel/trace/trace_osnoise.c
>> >> @@ -1538,18 +1538,20 @@ static int run_osnoise(void)
>> >> /*
>> >> * In some cases, notably when running on a nohz_full CPU with
>> >> * a stopped tick PREEMPT_RCU has no way to account for QSs.
>> >> - * This will eventually cause unwarranted noise as PREEMPT_RCU
>> >> - * will force preemption as the means of ending the current
>> >> - * grace period. We avoid this problem by calling
>> >> - * rcu_momentary_eqs(), which performs a zero duration
>> >> - * EQS allowing PREEMPT_RCU to end the current grace period.
>> >> - * This call shouldn't be wrapped inside an RCU critical
>> >> - * section.
>> >> + * This will eventually cause unwarranted noise as RCU forces
>> >> + * preemption as the means of ending the current grace period.
>> >> + * We avoid this by calling rcu_momentary_eqs(), which performs
>> >> + * a zero duration EQS allowing RCU to end the current grace
>> >> + * period. This call shouldn't be wrapped inside an RCU
>> >> + * critical section.
>> >> *
>> >> - * Note that in non PREEMPT_RCU kernels QSs are handled through
>> >> - * cond_resched()
>> >> + * For non-PREEMPT_RCU kernels with cond_resched() (non-
>> >> + * PREEMPT_LAZY configurations), QSs are handled through
>> >> + * cond_resched(). For PREEMPT_LAZY kernels, we fallback to
>> >> + * the zero duration QS via rcu_momentary_eqs().
>> >> */
>> >> - if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
>> >> + if (IS_ENABLED(CONFIG_PREEMPT_RCU) ||
>> >> + (!IS_ENABLED(CONFIG_PREEMPT_RCU) && IS_ENABLED(CONFIG_PREEMPTION))) {
>> >> if (!disable_irq)
>> >> local_irq_disable();
>> >
>> > How about making this unconditional so it works everywhere and doesn't
>> > rely on cond_resched() Kconfig/preempt-dynamic mood?
>>
>> I think it's a minor matter given that this isn't a hot path, but
>> we don't really need it for the !PREEMPT_RCU configuration.
>
> Well, if you make it unconditional, cond_resched() / rcu_all_qs() won't do its
> own rcu_momentary_eqs(), because rcu_data.rcu_urgent_qs should
> be false. So that essentially unifies the behaviour across all configurations.
Ah, yes. That makes sense.
Ankur
>>
>> Still, given that both of those clauses imply CONFIG_PREEMPTION, we
>> can just simplify this to (with an appropriately adjusted comment):
>>
>> --- a/kernel/trace/trace_osnoise.c
>> +++ b/kernel/trace/trace_osnoise.c
>> @@ -1543,7 +1543,7 @@ static int run_osnoise(void)
>> * Note that in non PREEMPT_RCU kernels QSs are handled through
>> * cond_resched()
>> */
>> - if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
>> + if (IS_ENABLED(CONFIG_PREEMPTION)) {
>> if (!disable_irq)
>> local_irq_disable();
>>
>> --
>> ankur
--
ankur
On 2024-11-06 12:17:57 [-0800], Ankur Arora wrote:
> To reduce RCU noise for nohz_full configurations, osnoise depends
> on cond_resched() providing quiescent states for PREEMPT_RCU=n
> configurations. And, for PREEMPT_RCU=y configurations does this
> by directly calling rcu_momentary_eqs().
>
> With PREEMPT_LAZY=y, however, we can have configurations with
> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
> can help.
The problem is as you say CONFIG_PREEMPT_RCU=n + CONFIG_PREEMPTION=y.
You can't select any of those two directly but get here via
PREEMPT_LAZY=y + PREEMPT_DYNAMIC=n.

Please spell it out to make it obvious. It is not a large group of
configurations, it is exactly this combo.

With PREEMPT_LAZY=y + PREEMPT_DYNAMIC=n however we get PREEMPT_RCU=n
which means no direct rcu_momentary_eqs() invocations and
cond_resched() is an empty stub.
> Handle that by fallback to the explicit quiescent states via
> rcu_momentary_eqs().

> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Sebastian
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-11-06 12:17:57 [-0800], Ankur Arora wrote:
>> To reduce RCU noise for nohz_full configurations, osnoise depends
>> on cond_resched() providing quiescent states for PREEMPT_RCU=n
>> configurations. And, for PREEMPT_RCU=y configurations does this
>> by directly calling rcu_momentary_eqs().
>>
>> With PREEMPT_LAZY=y, however, we can have configurations with
>> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
>> can help.
>
> The problem is as you say CONFIG_PREEMPT_RCU=n + CONFIG_PREEMPTION=y.
> You can't select any of those two directly but get here via
> PREEMPT_LAZY=y + PREEMPT_DYNAMIC=n.
>
> Please spell it out to make it obvious. It is not a large group of
> configurations, it is exactly this combo.
Makes sense. Will do.
Thanks.
Ankur
> With PREEMPT_LAZY=y + PREEMPT_DYNAMIC=n however we get PREEMPT_RCU=n
> which means no direct rcu_momentary_eqs() invocations and
> cond_resched() is an empty stub.
>
>> Handle that by fallback to the explicit quiescent states via
>> rcu_momentary_eqs().
>
>> Cc: Paul E. McKenney <paulmck@kernel.org>
>> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>
> Sebastian
--
ankur