[PATCH v5 0/3] sched/fair: Improve nohz fields for large systems

Shrikanth Hegde posted 3 patches 3 weeks, 2 days ago
kernel/sched/fair.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
Posted by Shrikanth Hegde 3 weeks, 2 days ago
On large systems, the cacheline holding nohz.nr_cpus was seen to be
contended. Many CPUs do atomic inc/dec and reads on it at the same time,
so this line can bounce often.

Patches 1 and 2 are minor. They look like the correct thing to do, but
are not very important.

Patch 3 is the main one: it gets rid of nr_cpus and instead uses the cpumask,
which is always updated alongside it. Functionally it should serve the same
purpose. The rest of the fields aren't updated that often, so this line
shouldn't bounce that often.
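
To make the idea behind patch 3 concrete, here is a minimal userspace sketch.
It is not the kernel code: struct toy_nohz, the helper names, the fixed
NR_CPUS value and the use of C11 atomics are all assumptions made for
illustration. It only shows that a counter maintained next to the mask carries
no information the mask does not already have, so the extra atomic inc/dec
(and the cacheline it dirties) can be dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS       256                      /* hypothetical system size */
#define BITS_PER_WORD 64
#define MASK_WORDS    (NR_CPUS / BITS_PER_WORD)

/* Toy stand-in for the nohz state; names only mirror the cover letter. */
struct toy_nohz {
	_Atomic unsigned long long idle_cpus_mask[MASK_WORDS];
	atomic_int nr_cpus;                    /* the counter the series removes */
};

static void toy_nohz_init(struct toy_nohz *n)
{
	for (int i = 0; i < MASK_WORDS; i++)
		atomic_init(&n->idle_cpus_mask[i], 0ULL);
	atomic_init(&n->nr_cpus, 0);
}

/* Mask-only scheme: idle entry/exit touches just the CPU's bit. */
static void idle_enter_new(struct toy_nohz *n, int cpu)
{
	atomic_fetch_or(&n->idle_cpus_mask[cpu / BITS_PER_WORD],
			1ULL << (cpu % BITS_PER_WORD));
}

static void idle_exit_new(struct toy_nohz *n, int cpu)
{
	atomic_fetch_and(&n->idle_cpus_mask[cpu / BITS_PER_WORD],
			 ~(1ULL << (cpu % BITS_PER_WORD)));
}

/* Counter scheme: same mask update plus an atomic inc/dec on every transition. */
static void idle_enter_old(struct toy_nohz *n, int cpu)
{
	idle_enter_new(n, cpu);
	atomic_fetch_add(&n->nr_cpus, 1);
}

static void idle_exit_old(struct toy_nohz *n, int cpu)
{
	idle_exit_new(n, cpu);
	atomic_fetch_sub(&n->nr_cpus, 1);
}

/* "Is any CPU tickless?" answered from the counter ... */
static bool any_idle_old(struct toy_nohz *n)
{
	return atomic_load(&n->nr_cpus) != 0;
}

/* ... and answered from the mask alone (read-only, no counter needed). */
static bool any_idle_new(struct toy_nohz *n)
{
	for (int i = 0; i < MASK_WORDS; i++)
		if (atomic_load(&n->idle_cpus_mask[i]) != 0)
			return true;
	return false;
}

/* Number of set bits in the mask, i.e. its weight. */
static int idle_weight(struct toy_nohz *n)
{
	int count = 0;

	for (int i = 0; i < MASK_WORDS; i++) {
		unsigned long long w = atomic_load(&n->idle_cpus_mask[i]);

		for (; w; w &= w - 1)          /* clear lowest set bit */
			count++;
	}
	return count;
}

/* Drive the counter scheme and check the mask alone gives the same answers. */
int toy_nohz_demo(void)
{
	struct toy_nohz n;
	int weight;

	toy_nohz_init(&n);
	assert(!any_idle_old(&n) && !any_idle_new(&n));

	idle_enter_old(&n, 3);
	idle_enter_old(&n, 200);
	assert(any_idle_old(&n) == any_idle_new(&n));
	weight = idle_weight(&n);
	assert(weight == atomic_load(&n.nr_cpus));

	idle_exit_old(&n, 3);
	idle_exit_old(&n, 200);
	assert(!any_idle_old(&n) && !any_idle_new(&n));
	return weight;
}
```

In this toy model the readers in the mask-only scheme never write, and idle
entry/exit performs one atomic update instead of two. How the real series
reads the mask is defined by the patches themselves, not by this sketch.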

The contention issue with nohz.idle_cpus_mask still remains. It is mostly in
a separate cacheline from nohz. There are ongoing efforts to mitigate it; it
is not addressed by this series.

v4 -> v5:
- Collected tags (Thanks to K Prateek Nayak, Valentin Schneider)
- Added a comment to patch 1, noting a narrow window where
  kick_ilb will be called unnecessarily. (Vincent Guittot)

v3 -> v4:
- Added a note to the changelog about one less cacheline being dirtied on
  idle entry/exit (Valentin Schneider)

v2 -> v3:
- Converted out to return when no CPU is in tickless mode,
  since find_ilb_cpu returns anyway (K Prateek Nayak)

v1 -> v2:
- Dropped patch to check has_blocked based on time.
- Detailed changelog for removing nr_cpus (Thanks to Ingo Molnar)

v1: https://lore.kernel.org/all/20251201183146.74443-1-sshegde@linux.ibm.com/
v2: https://lore.kernel.org/all/20260102124744.360872-1-sshegde@linux.ibm.com/
v3: https://lore.kernel.org/all/20260107065125.669668-1-sshegde@linux.ibm.com/
v4: https://lore.kernel.org/all/20260112050442.138446-1-sshegde@linux.ibm.com/

Shrikanth Hegde (3):
  sched/fair: Move checking for nohz cpus after time check
  sched/fair: Change likelyhood of nohz.nr_cpus
  sched/fair: Remove nohz.nr_cpus and use weight of cpumask instead

 kernel/sched/fair.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

-- 
2.51.0
Re: [PATCH v5 0/3] sched/fair: Improve nohz fields for large systems
Posted by Vincent Guittot 3 weeks, 2 days ago
On Thu, 15 Jan 2026 at 08:36, Shrikanth Hegde <sshegde@linux.ibm.com> wrote:
> v4 -> v5:
> - Collected tags (Thanks to K Prateek Nayak, Valentin Schneider)
> - Added comment for patch 1, making note of a narrow window where
>   kick_ilb will be called un-necessarily. (Vincent Guittot)

Looks good to me
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
