Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
kick path") removed the rcu_read_lock()/unlock() pair from
set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
that all callers run in a safe context for rcu_dereference_all(): IRQs
disabled or cpus_write_lock() held.

That assumption is wrong for the CPU hotplug teardown path. When CPUs
are taken offline, set_cpu_sd_state_busy() is invoked via:

  cpuhp/N kthread
    cpuhp_thread_fun()
      cpuhp_invoke_callback()
        sched_cpu_deactivate()
          nohz_balance_exit_idle()
            set_cpu_sd_state_busy()
              rcu_dereference_all(per_cpu(sd_llc, cpu))

The cpuhp kthread holds cpu_hotplug_lock (percpu-rwsem) but runs with
preemption and IRQs enabled. As a result, lockdep correctly reports a
suspicious RCU usage on CPU offline, e.g.:

  # echo 0 > /sys/devices/system/cpu/cpu1/online
  =============================
  WARNING: suspicious RCU usage
  -----------------------------
  kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
  ...
  2 locks held by cpuhp/1/20:
   #0: (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
   #1: (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae
  Call Trace:
   lockdep_rcu_suspicious
   nohz_balance_exit_idle
   sched_cpu_deactivate
   cpuhp_invoke_callback
   cpuhp_thread_fun
   smpboot_thread_fn
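
For illustration only, the context rule behind the splat above can be modelled in plain userspace C. This is a rough sketch and not kernel code: struct ctx_model, rcu_deref_context_ok() and the three sample contexts are hypothetical names, and the rule itself (an RCU read-side section, disabled IRQs, or disabled preemption satisfies the check, while a sleepable lock such as cpu_hotplug_lock does not) is an assumption about how the lockdep check behaves.

```c
#include <stdbool.h>

/* Hypothetical userspace model of an execution context; not kernel code. */
struct ctx_model {
	int  rcu_read_depth;     /* nesting depth of rcu_read_lock() */
	bool irqs_disabled;      /* local IRQs are off */
	bool preempt_disabled;   /* preemption is off */
	bool holds_hotplug_lock; /* cpu_hotplug_lock held (sleepable lock) */
};

/*
 * Assumed rule: the dereference is acceptable inside any flavor of RCU
 * read-side critical section. Holding a sleepable lock is deliberately
 * not part of the condition.
 */
static bool rcu_deref_context_ok(const struct ctx_model *c)
{
	return c->rcu_read_depth > 0 || c->irqs_disabled || c->preempt_disabled;
}

/* cpuhp kthread as described above: lock held, IRQs and preemption on. */
static const struct ctx_model cpuhp_before_fix = {
	.rcu_read_depth = 0, .irqs_disabled = false,
	.preempt_disabled = false, .holds_hotplug_lock = true,
};

/* Same thread once the helper enters an RCU read-side section itself. */
static const struct ctx_model cpuhp_after_fix = {
	.rcu_read_depth = 1, .irqs_disabled = false,
	.preempt_disabled = false, .holds_hotplug_lock = true,
};

/* nohz_balancer_kick() path: IRQs disabled, no explicit RCU lock. */
static const struct ctx_model nohz_kick_path = {
	.rcu_read_depth = 0, .irqs_disabled = true,
	.preempt_disabled = false, .holds_hotplug_lock = false,
};
```

In this model only cpuhp_before_fix fails the check, which matches the report: the hotplug lock is held, but nothing marks the thread as an RCU reader.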

Restore RCU locking in both helpers. nohz_balancer_kick() is left as is,
since its IRQ-disabled context is genuinely sufficient.

Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com/
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
Changes in v2:
- Use guard(rcu)() instead of rcu_read_lock/unlock() (Prateek)
- Link to v1: https://lore.kernel.org/all/20260521205115.1689545-1-arighi@nvidia.com
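
For readers unfamiliar with scope-based guards, here is a minimal userspace sketch of the idea behind the v2 change, built on the compiler's cleanup attribute (GCC/Clang). It is not the kernel's implementation: MOCK_GUARD_RCU(), mock_rcu_read_lock(), mock_rcu_read_unlock() and mock_set_cpu_sd_state() are hypothetical stand-ins, and the "lock" is just a counter.

```c
/* Hypothetical stand-in for the RCU read-side nesting depth. */
static int mock_rcu_depth;

static void mock_rcu_read_lock(void)   { mock_rcu_depth++; }
static void mock_rcu_read_unlock(void) { mock_rcu_depth--; }

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void mock_rcu_guard_exit(int *unused)
{
	(void)unused;
	mock_rcu_read_unlock();
}

/* Hypothetical macro: take the lock now, drop it at scope exit. */
#define MOCK_GUARD_RCU() \
	int mock_rcu_guard __attribute__((cleanup(mock_rcu_guard_exit), unused)) = \
		(mock_rcu_read_lock(), 0)

/*
 * Mimics the shape of the patched helpers: the guard is taken first, then
 * the protected access happens. Returns the nesting depth observed at the
 * point where the dereference would take place.
 */
static int mock_set_cpu_sd_state(int have_sd)
{
	MOCK_GUARD_RCU();
	int depth_at_deref = mock_rcu_depth;

	if (!have_sd)
		return depth_at_deref;	/* early return still drops the lock */

	return depth_at_deref;
}
```

Calling mock_set_cpu_sd_state() with either argument observes a depth of 1 inside the function and leaves mock_rcu_depth at 0 afterwards. The point of the guard form is that every return path unlocks without a matching explicit call.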
kernel/sched/fair.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aac24cfddecdf..49ec9208bb6c4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -14070,6 +14070,8 @@ static void nohz_balancer_kick(struct rq *rq)
 static void set_cpu_sd_state_busy(int cpu)
 {
 	struct sched_domain *sd;
+
+	guard(rcu)();
 	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
 
 	/*
@@ -14099,6 +14101,8 @@ void nohz_balance_exit_idle(struct rq *rq)
 static void set_cpu_sd_state_idle(int cpu)
 {
 	struct sched_domain *sd;
+
+	guard(rcu)();
 	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
 
 	/* See set_cpu_sd_state_busy(): nohz_idle is only used with sd->shared. */
base-commit: b07a332d9cbbd9cc9cfa923a21bd061cfb69bea5
--
2.54.0