[PATCH v3] sched/fair: Fix RCU usage in NOHZ exit path on CPU offline

Posted by Andrea Righi 2 days, 8 hours ago
Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
kick path") removed the rcu_read_lock()/unlock() pair from
set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
that all callers run in a safe context for rcu_dereference_all(): IRQs
disabled or cpus_write_lock() held.

That assumption is wrong for the CPU hotplug teardown path. When CPUs
are taken offline, set_cpu_sd_state_busy() is invoked via:

 cpuhp/N kthread
   cpuhp_thread_fun()
     cpuhp_invoke_callback()
       sched_cpu_deactivate()
         nohz_balance_exit_idle()
           set_cpu_sd_state_busy()
             rcu_dereference_all(per_cpu(sd_llc, cpu))

The cpuhp kthread holds cpu_hotplug_lock (percpu-rwsem) but runs with
preemption and IRQs enabled. As a result, lockdep correctly reports a
suspicious RCU usage on CPU offline, e.g.:

  # echo 0 > /sys/devices/system/cpu/cpu1/online

  =============================
  WARNING: suspicious RCU usage
  -----------------------------
  kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
  ...
  2 locks held by cpuhp/1/20:
   #0: (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
   #1: (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae

  Call Trace:
    lockdep_rcu_suspicious
    nohz_balance_exit_idle
    sched_cpu_deactivate
    cpuhp_invoke_callback
    cpuhp_thread_fun
    smpboot_thread_fn

Fix this by taking the RCU read lock around the one call site that lacks
protection: the nohz_balance_exit_idle() call in sched_cpu_deactivate(),
on the CPU hotplug teardown path.

The other callers (nohz_balancer_kick() and nohz_balance_enter_idle())
genuinely run with IRQs disabled, so they remain unchanged.

Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com/
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
Changes in v3:
 - Add RCU coverage to the offending caller (sched_cpu_deactivate()) (Peter)
 - Link to v2: https://lore.kernel.org/all/20260522051923.1840812-1-arighi@nvidia.com

Changes in v2:
 - Use guard(rcu)() instead of rcu_read_lock/unlock() (Prateek)
 - Link to v1: https://lore.kernel.org/all/20260521205115.1689545-1-arighi@nvidia.com
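
For readers less familiar with scoped_guard(), the following is a minimal
userspace sketch of the call shape before and after this change. It is
not kernel code: every model_* name and the MODEL_SCOPED_GUARD_RCU()
macro are hypothetical stand-ins, and the "safe" condition is reduced to
"read lock held or IRQs disabled", which is only an approximation of
what the real lockdep check accepts. The macro relies on the GCC/Clang
cleanup attribute, loosely mirroring how scope-based guards release on
scope exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel state; all names are hypothetical. */
int model_rcu_depth;        /* nesting depth of the modelled read-side section */
bool model_irqs_disabled;   /* modelled IRQ state of the calling context */

void model_rcu_read_lock(void)   { model_rcu_depth++; }
void model_rcu_read_unlock(void) { model_rcu_depth--; }

/* Cleanup handler: invoked when the guard variable leaves scope. */
void model_rcu_guard_exit(int *unused)
{
	(void)unused;
	model_rcu_read_unlock();
}

/*
 * Simplified model of scoped_guard(rcu): take the lock when the guard
 * variable is initialised, run the following statement exactly once,
 * and drop the lock when the for-statement scope ends.
 */
#define MODEL_SCOPED_GUARD_RCU()					\
	for (int _guard __attribute__((cleanup(model_rcu_guard_exit))) = \
		     (model_rcu_read_lock(), 0), _once = 1;		\
	     _once; _once = 0)

/* Approximates the condition under which the dereference is accepted. */
bool model_deref_is_safe(void)
{
	return model_rcu_depth > 0 || model_irqs_disabled;
}

/* Stand-in for nohz_balance_exit_idle(): reports whether its deref was safe. */
bool model_exit_idle(void)
{
	return model_deref_is_safe();
}

/* Call shape before the fix: IRQs enabled, no read lock held. */
bool model_deactivate_unguarded(void)
{
	return model_exit_idle();
}

/* Call shape after the fix: the call is wrapped in the scoped guard. */
bool model_deactivate_guarded(void)
{
	bool safe = false;

	MODEL_SCOPED_GUARD_RCU()
		safe = model_exit_idle();

	return safe;
}
```

In this model, model_deactivate_unguarded() corresponds to the context
that triggers the splat, while model_deactivate_guarded() holds the lock
only for the duration of the single wrapped call and releases it before
returning.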

 kernel/sched/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7fb3f5f2d48c0..b3a416b1c2510 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8681,7 +8681,8 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 * Remove CPU from nohz.idle_cpus_mask to prevent participating in
 	 * load balancing when not active
 	 */
-	nohz_balance_exit_idle(rq);
+	scoped_guard (rcu)
+		nohz_balance_exit_idle(rq);
 
 	set_cpu_active(cpu, false);
 

base-commit: b07a332d9cbbd9cc9cfa923a21bd061cfb69bea5
-- 
2.54.0
Re: [PATCH v3] sched/fair: Fix RCU usage in NOHZ exit path on CPU offline
Posted by Marek Szyprowski 2 days, 6 hours ago
On 22.05.2026 11:25, Andrea Righi wrote:
> Commit c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ
> kick path") removed the rcu_read_lock()/unlock() pair from
> set_cpu_sd_state_busy() and set_cpu_sd_state_idle() on the assumption
> that all callers run in a safe context for rcu_dereference_all(): IRQs
> disabled or cpus_write_lock() held.
>
> That assumption is wrong for the CPU hotplug teardown path. When CPUs
> are taken offline, set_cpu_sd_state_busy() is invoked via:
>
>  cpuhp/N kthread
>    cpuhp_thread_fun()
>      cpuhp_invoke_callback()
>        sched_cpu_deactivate()
>          nohz_balance_exit_idle()
>            set_cpu_sd_state_busy()
>              rcu_dereference_all(per_cpu(sd_llc, cpu))
>
> The cpuhp kthread holds cpu_hotplug_lock (percpu-rwsem) but runs with
> preemption and IRQs enabled. As a result, lockdep correctly reports a
> suspicious RCU usage on CPU offline, e.g.:
>
>   # echo 0 > /sys/devices/system/cpu/cpu1/online
>
>   =============================
>   WARNING: suspicious RCU usage
>   -----------------------------
>   kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
>   ...
>   2 locks held by cpuhp/1/20:
>    #0: (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
>    #1: (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae
>
>   Call Trace:
>     lockdep_rcu_suspicious
>     nohz_balance_exit_idle
>     sched_cpu_deactivate
>     cpuhp_invoke_callback
>     cpuhp_thread_fun
>     smpboot_thread_fn
>
> Fix this by adding RCU read lock coverage to the one caller that lacks
> it: nohz_balance_exit_idle() in the CPU hotplug teardown.
>
> The other callers (nohz_balancer_kick() and nohz_balance_enter_idle())
> genuinely run with IRQs disabled, so they remain unchanged.
>
> Fixes: c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Closes: https://lore.kernel.org/all/38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com/
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: K Prateek Nayak <kprateek.nayak@amd.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>


> ---
> Changes in v3:
>  - Add RCU coverage the offending caller (sched_cpu_deactivate()) (Peter)
>  - Link to v2: https://lore.kernel.org/all/20260522051923.1840812-1-arighi@nvidia.com
>
> Changes in v2:
>  - Use guard(rcu)() instead of rcu_read_lock/unlock() (Prateek)
>  - Link to v1: https://lore.kernel.org/all/20260521205115.1689545-1-arighi@nvidia.com
>
>  kernel/sched/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7fb3f5f2d48c0..b3a416b1c2510 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8681,7 +8681,8 @@ int sched_cpu_deactivate(unsigned int cpu)
>  	 * Remove CPU from nohz.idle_cpus_mask to prevent participating in
>  	 * load balancing when not active
>  	 */
> -	nohz_balance_exit_idle(rq);
> +	scoped_guard (rcu)
> +		nohz_balance_exit_idle(rq);
>  
>  	set_cpu_active(cpu, false);
>  
>
> base-commit: b07a332d9cbbd9cc9cfa923a21bd061cfb69bea5

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland