This series adds RCU and some leftover scheduler bits for lazy
preemption.
The main problem addressed in the RCU related patches is that before
PREEMPT_LAZY, PREEMPTION=y implied PREEMPT_RCU=y. With PREEMPT_LAZY,
that's no longer true.
That's because PREEMPT_RCU makes some trade-offs to optimize for
latency as opposed to throughput, and configurations with limited
preemption might prefer the stronger forward-progress guarantees of
PREEMPT_RCU=n.
Accordingly, with standalone PREEMPT_LAZY (much like PREEMPT_NONE,
PREEMPT_VOLUNTARY) we want to use PREEMPT_RCU=n. And, when used in
conjunction with PREEMPT_DYNAMIC, we continue to use PREEMPT_RCU=y.
Patches 1 and 2 are cleanup patches:
"rcu: fix header guard for rcu_all_qs()"
"rcu: rename PREEMPT_AUTO to PREEMPT_LAZY"
Patch 3, "rcu: limit PREEMPT_RCU configurations", explicitly limits
PREEMPT_RCU=y to the PREEMPT_DYNAMIC or the latency oriented models.
Patches 4 and 5,
"rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y"
"osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y"
handle quiescent states for the (PREEMPT_LAZY=y, PREEMPT_RCU=n)
configuration.
And finally, patch 6,
"sched: warn for high latency with TIF_NEED_RESCHED_LAZY"
adds a high-latency warning for TIF_NEED_RESCHED_LAZY.
Goes on top of PeterZ's tree:
git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core
Changelog:
- fixup incorrect usage of tif_need_resched_lazy() (comment from
Sebastian Andrzej Siewior)
- massaged the commit messages a bit
- drops the powerpc support for PREEMPT_LAZY as that was orthogonal
to this series (Shrikanth will send that out separately.)
Please review.
Ankur Arora (6):
rcu: fix header guard for rcu_all_qs()
rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
rcu: limit PREEMPT_RCU configurations
rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
sched: warn for high latency with TIF_NEED_RESCHED_LAZY
include/linux/rcutree.h | 2 +-
include/linux/srcutiny.h | 2 +-
kernel/rcu/Kconfig | 4 ++--
kernel/rcu/srcutiny.c | 14 +++++++-------
kernel/rcu/tree_plugin.h | 11 +++++++----
kernel/sched/core.c | 3 ++-
kernel/sched/debug.c | 7 +++++--
kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
8 files changed, 37 insertions(+), 28 deletions(-)
--
2.43.5
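[Editorial sketch, not part of the posted series.] The configuration rule described in the cover letter can be modelled in a few lines of userspace C. This only illustrates the stated intent (PREEMPT_RCU=y under PREEMPT_DYNAMIC or the latency-oriented models, PREEMPT_RCU=n otherwise). The enum, the function name, and the reading of "latency oriented models" as full preemption and PREEMPT_RT are assumptions made for this example; none of them are kernel symbols, and the real rule lives in kernel/rcu/Kconfig.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the preemption models discussed in
 * the cover letter. These names are illustrative, not kernel symbols.
 */
enum preempt_model {
	MODEL_NONE,		/* PREEMPT_NONE */
	MODEL_VOLUNTARY,	/* PREEMPT_VOLUNTARY */
	MODEL_LAZY,		/* PREEMPT_LAZY */
	MODEL_FULL,		/* full preemption (assumed "latency oriented") */
	MODEL_RT,		/* PREEMPT_RT (assumed "latency oriented") */
};

/*
 * Returns whether PREEMPT_RCU would default to y under the rule the
 * cover letter describes: y with PREEMPT_DYNAMIC, y for the
 * latency-oriented models, n for everything else, including
 * standalone PREEMPT_LAZY.
 */
static bool preempt_rcu_default(enum preempt_model model, bool preempt_dynamic)
{
	if (preempt_dynamic)
		return true;

	return model == MODEL_FULL || model == MODEL_RT;
}
```

Under this model, preempt_rcu_default(MODEL_LAZY, false) is false while preempt_rcu_default(MODEL_LAZY, true) is true, which matches the "standalone" versus "in conjunction with PREEMPT_DYNAMIC" distinction drawn above.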
On 2024-11-06 12:17:52 [-0800], Ankur Arora wrote:
> This series adds RCU and some leftover scheduler bits for lazy
> preemption.
This is not critical for the current implementation. The way I
understand it, you make a change in 3/6 and then all other patches
in this series are required to deal with this.
For bisect reasons it would make sense to have 3/6 last in the series
and to do the "fixes" first, before the code is enabled. I mean if you apply
3/6 first then you get build failures without 1/6. But with 3/6 before
5/6 you should get runtime errors, right?
> The main problem addressed in the RCU related patches is that before
> PREEMPT_LAZY, PREEMPTION=y implied PREEMPT_RCU=y. With PREEMPT_LAZY,
> that's no longer true.
No, you want to make PREEMPTION=y + PREEMPT_RCU=n + PREEMPT_LAZY=y
possible. This is different. Your wording makes it sound like there _is_
an actual problem.
> That's because PREEMPT_RCU makes some trade-offs to optimize for
> latency as opposed to throughput, and configurations with limited
> preemption might prefer the stronger forward-progress guarantees of
> PREEMPT_RCU=n.
>
> Accordingly, with standalone PREEMPT_LAZY (much like PREEMPT_NONE,
> PREEMPT_VOLUNTARY) we want to use PREEMPT_RCU=n. And, when used in
> conjunction with PREEMPT_DYNAMIC, we continue to use PREEMPT_RCU=y.
>
> Patches 1 and 2 are cleanup patches:
> "rcu: fix header guard for rcu_all_qs()"
> "rcu: rename PREEMPT_AUTO to PREEMPT_LAZY"
>
> Patch 3, "rcu: limit PREEMPT_RCU configurations", explicitly limits
> PREEMPT_RCU=y to the PREEMPT_DYNAMIC or the latency oriented models.
>
> Patches 4 and 5,
> "rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y"
> "osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y"
>
> handle quiescent states for the (PREEMPT_LAZY=y, PREEMPT_RCU=n)
> configuration.
I was briefly thinking about
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5646,8 +5646,11 @@ void sched_tick(void)
 	hw_pressure = arch_scale_hw_pressure(cpu_of(rq));
 	update_hw_load_avg(rq_clock_task(rq), rq, hw_pressure);
 
-	if (dynamic_preempt_lazy() && tif_test_bit(TIF_NEED_RESCHED_LAZY))
+	if (dynamic_preempt_lazy() && tif_test_bit(TIF_NEED_RESCHED_LAZY)) {
 		resched_curr(rq);
+		if (!IS_ENABLED(CONFIG_PREEMPT_RCU))
+			rcu_all_qs();
+	}
 
 	donor->sched_class->task_tick(rq, donor, 0);
 	if (sched_feat(LATENCY_WARN))
which should make #4+ #5 obsolete. But I think it is nicer to have the
change in #4 since it extends the check to cover all cases. And then
we would do it twice just for osnoise.
Sebastian
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-11-06 12:17:52 [-0800], Ankur Arora wrote:
>> This series adds RCU and some leftover scheduler bits for lazy
>> preemption.
>
> This is not critical for the current implementation. The way I
> understand it, you make a change in 3/6 and then all other patches
> in this series are required to deal with this.
>
> For bisect reasons it would make sense to have 3/6 last in the series
> and to do the "fixes" first, before the code is enabled. I mean if you apply
> 3/6 first then you get build failures without 1/6. But with 3/6 before
> 5/6 you should get runtime errors, right?
That's a good point. Will reorder.
>> The main problem addressed in the RCU related patches is that before
>> PREEMPT_LAZY, PREEMPTION=y implied PREEMPT_RCU=y. With PREEMPT_LAZY,
>> that's no longer true.
>
> No, you want to make PREEMPTION=y + PREEMPT_RCU=n + PREEMPT_LAZY=y
> possible. This is different. Your wording makes it sound like there _is_
> an actual problem.
That's too literal a reading. It's just the problem ("matter or
situation that is unwelcome" to quote from a dictionary) addressed in
the patches.
>> That's because PREEMPT_RCU makes some trade-offs to optimize for
>> latency as opposed to throughput, and configurations with limited
>> preemption might prefer the stronger forward-progress guarantees of
>> PREEMPT_RCU=n.
>>
>> Accordingly, with standalone PREEMPT_LAZY (much like PREEMPT_NONE,
>> PREEMPT_VOLUNTARY) we want to use PREEMPT_RCU=n. And, when used in
>> conjunction with PREEMPT_DYNAMIC, we continue to use PREEMPT_RCU=y.
>>
>> Patches 1 and 2 are cleanup patches:
>> "rcu: fix header guard for rcu_all_qs()"
>> "rcu: rename PREEMPT_AUTO to PREEMPT_LAZY"
>>
>> Patch 3, "rcu: limit PREEMPT_RCU configurations", explicitly limits
>> PREEMPT_RCU=y to the PREEMPT_DYNAMIC or the latency oriented models.
>>
>> Patches 4 and 5,
>> "rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y"
>> "osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y"
>>
>> handle quiescent states for the (PREEMPT_LAZY=y, PREEMPT_RCU=n)
>> configuration.
>
> I was briefly thinking about
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5646,8 +5646,11 @@ void sched_tick(void)
>  	hw_pressure = arch_scale_hw_pressure(cpu_of(rq));
>  	update_hw_load_avg(rq_clock_task(rq), rq, hw_pressure);
>
> -	if (dynamic_preempt_lazy() && tif_test_bit(TIF_NEED_RESCHED_LAZY))
> +	if (dynamic_preempt_lazy() && tif_test_bit(TIF_NEED_RESCHED_LAZY)) {
>  		resched_curr(rq);
> +		if (!IS_ENABLED(CONFIG_PREEMPT_RCU))
> +			rcu_all_qs();
> +	}
>
>  	donor->sched_class->task_tick(rq, donor, 0);
>  	if (sched_feat(LATENCY_WARN))
>
> which should make #4+ #5 obsolete. But I think it is nicer to have the
> change in #4 since it extends the check to cover all cases. And then
> we would do it twice just for osnoise.
Yeah, exactly. The check here only deals with this specific case
while the one in rcu_flavor_sched_clock_irq() can handle that more
generally.
Thanks.
--
ankur
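[Editorial sketch, not part of the thread.] The distinction drawn at the end of the exchange, a check tied to one specific case versus a more general one, can be illustrated with a small userspace model. The first predicate mirrors the condition in the sched_tick() diff quoted above. The second is only one plausible shape for a general tick-time check and is an assumption: it is not taken from patch 4. The struct, the function names, and the HARDIRQ_OFFSET_MODEL constant (standing in for a preempt_count value that shows nothing but the tick's own hardirq level) are all hypothetical.

```c
#include <stdbool.h>

/* Assumed for this model: one hardirq nesting level is bit 16. */
#define HARDIRQ_OFFSET_MODEL (1u << 16)

/* Hypothetical snapshot of what the tick handler can observe. */
struct tick_ctx {
	bool user;			/* tick interrupted user mode */
	bool from_idle;			/* tick interrupted the idle loop */
	unsigned int preempt_count;	/* count seen inside the tick */
};

/*
 * Narrow check, mirroring the quoted sched_tick() diff: a quiescent
 * state is reported only when lazy preemption is active, a lazy
 * reschedule is pending, and PREEMPT_RCU is disabled.
 */
static bool narrow_tick_qs(bool lazy_enabled, bool lazy_pending, bool preempt_rcu)
{
	return lazy_enabled && lazy_pending && !preempt_rcu;
}

/*
 * General check (assumed form): report a quiescent state whenever the
 * tick interrupted user mode, idle, or kernel code that held no
 * preempt-disable or softirq-disable count, regardless of whether a
 * lazy reschedule happens to be pending.
 */
static bool general_tick_qs(const struct tick_ctx *c)
{
	return c->user || c->from_idle ||
	       c->preempt_count == HARDIRQ_OFFSET_MODEL;
}
```

In this model, a tick that lands in plain preemptible kernel code with no lazy reschedule pending is a quiescent state for general_tick_qs() but not for narrow_tick_qs(), which is the sense in which the general check "covers all cases".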