[PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess

Yury Norov posted 5 patches 3 years, 7 months ago
arch/loongarch/kernel/setup.c |  2 +-
arch/mips/kernel/setup.c      |  2 +-
arch/x86/kernel/smpboot.c     |  4 ++--
arch/x86/xen/smp_pv.c         |  2 +-
include/linux/cpumask.h       | 22 +++++++++++-----------
kernel/smp.c                  |  6 ++++--
lib/Kconfig                   |  9 +++++++++
7 files changed, 29 insertions(+), 18 deletions(-)
[PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Yury Norov 3 years, 7 months ago
cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
despite that the variables have different meaning and purpose. It makes
some cpumask functions broken.

This series cleans that mess and adds new config FORCE_NR_CPUS that
allows to optimize cpumask subsystem if the number of CPUs is known
at compile-time.

After some testing I found build broken when SMP is on and NR_CPUS == 1.
This is addressed in a new patch #1, and in the following patch #2 that
now declares set_nr_cpu_ids in cpumask.h (was in smp.h).

v1: https://lore.kernel.org/lkml/20220829165748.1008477-1-yury.norov@gmail.com/T/#mecbd787f8d1bff1454a4ec2fe46ad6dc168df695
v2:
 - don't declare nr_cpu_ids if NR_CPUS == 1;
 - move set_nr_cpu_ids() from smp.h to cpumask.h to avoid errors in some
   inclusion paths;
 - drop 'default n' in FORCE_NR_CPUS option description;
 - rebase on top of v6.0-rc4.

Yury Norov (5):
  smp: don't declare nr_cpu_ids if NR_CPUS == 1
  smp: add set_nr_cpu_ids()
  lib/cpumask: delete misleading comment
  lib/cpumask: deprecate nr_cpumask_bits
  lib/cpumask: add FORCE_NR_CPUS config option

 arch/loongarch/kernel/setup.c |  2 +-
 arch/mips/kernel/setup.c      |  2 +-
 arch/x86/kernel/smpboot.c     |  4 ++--
 arch/x86/xen/smp_pv.c         |  2 +-
 include/linux/cpumask.h       | 22 +++++++++++-----------
 kernel/smp.c                  |  6 ++++--
 lib/Kconfig                   |  9 +++++++++
 7 files changed, 29 insertions(+), 18 deletions(-)

-- 
2.34.1
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Yury Norov 3 years, 6 months ago
On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> despite that the variables have different meaning and purpose. It makes
> some cpumask functions broken.
> 
> This series cleans that mess and adds new config FORCE_NR_CPUS that
> allows to optimize cpumask subsystem if the number of CPUs is known
> at compile-time.
> 
> After some testing I found build broken when SMP is on and NR_CPUS == 1.
> This is addressed in a new patch #1, and in the following patch #2 that
> now declares set_nr_cpu_ids in cpumask.h (was in smp.h).
> 
> v1: https://lore.kernel.org/lkml/20220829165748.1008477-1-yury.norov@gmail.com/T/#mecbd787f8d1bff1454a4ec2fe46ad6dc168df695
> v2:
>  - don't declare nr_cpu_ids if NR_CPUS == 1;
>  - move set_nr_cpu_ids() from smp.h to cpumask.h to avoid errors in some
>    inclusion paths;
>  - drop 'default n' in FORCE_NR_CPUS option description;
>  - rebase on top of v6.0-rc4.

Any more comments? If not, I'd like to move it into bitmap-for-next.
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Peter Zijlstra 3 years, 7 months ago
On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> despite that the variables have different meaning and purpose. It makes
> some cpumask functions broken.
> 
> This series cleans that mess and adds new config FORCE_NR_CPUS that
> allows to optimize cpumask subsystem if the number of CPUs is known
> at compile-time.

Who will use this? Distro's can't, which means 99% of people will not
use this ever. Is it worth it?
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Yury Norov 3 years, 7 months ago
On Tue, Sep 06, 2022 at 10:55:20AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> > cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> > despite that the variables have different meaning and purpose. It makes
> > some cpumask functions broken.
> > 
> > This series cleans that mess and adds new config FORCE_NR_CPUS that
> > allows to optimize cpumask subsystem if the number of CPUs is known
> > at compile-time.
> 
> Who will use this? Distro's can't,

Raspberry PI can. Smartphone vendors can, and probably should. Ditto
for vehicles and all embedded applications.

> which means 99% of people will not
> use this ever. Is it worth it?

I will definitely enable it for myself. Actually anyone who set NR_CPUS
to a non-default value, may also be interested in FORCE_NR_CPUS.
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Valentin Schneider 3 years, 7 months ago
On 06/09/22 10:55, Peter Zijlstra wrote:
> On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
>> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
>> despite that the variables have different meaning and purpose. It makes
>> some cpumask functions broken.
>>
>> This series cleans that mess and adds new config FORCE_NR_CPUS that
>> allows to optimize cpumask subsystem if the number of CPUs is known
>> at compile-time.
>
> Who will use this? Distro's can't, which means 99% of people will not
> use this ever. Is it worth it?

I'd tend to agree here.

One extra thing worth noting is CONFIG_CPUMASK_OFFSTACK=n cpumask_size()
still uses NR_CPUS under the hood, despite being (mostly) used to
dynamically allocate cpumasks. So having an unconditionnal

  #define nr_cpumask_bits nr_cpu_ids

would save up some memory for those allocations.

A quick compile test on x86 defconfig (OFFSTACK=n) gives me:

  Total: Before=18711411, After=18705653, chg -0.03%

If it's in the range of barely-half-a-page on other archs, could we just
do that then?
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Yury Norov 3 years, 7 months ago
On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:
> On 06/09/22 10:55, Peter Zijlstra wrote:
> > On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> >> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> >> despite that the variables have different meaning and purpose. It makes
> >> some cpumask functions broken.
> >>
> >> This series cleans that mess and adds new config FORCE_NR_CPUS that
> >> allows to optimize cpumask subsystem if the number of CPUs is known
> >> at compile-time.
> >
> > Who will use this? Distro's can't, which means 99% of people will not
> > use this ever. Is it worth it?
> 
> I'd tend to agree here.
> 
> One extra thing worth noting is CONFIG_CPUMASK_OFFSTACK=n cpumask_size()
> still uses NR_CPUS under the hood, despite being (mostly) used to
> dynamically allocate cpumasks. So having an unconditionnal
> 
>   #define nr_cpumask_bits nr_cpu_ids
> 
> would save up some memory for those allocations.

Thanks, I didn't mention this. This is exactly what I meant by
'cleaning the mess'.
 
> A quick compile test on x86 defconfig (OFFSTACK=n) gives me:
> 
>   Total: Before=18711411, After=18705653, chg -0.03%
 
All cpumask_size() allocations are runtime, right?

> If it's in the range of barely-half-a-page on other archs, could we just
> do that then?

How many is that in terms of I-cache lines?
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Peter Zijlstra 3 years, 7 months ago
On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:

>   #define nr_cpumask_bits nr_cpu_ids

That assumes the CPU space is dense; is this so? That is, you can have 4
CPUs while the highest cpu number is vastly larger than 4.

It's uncommon, but not unheard of I think. ISTR some BIOSes leaving
holes in the CPU space when there were empty CPU sockets on the
motherboard.
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Valentin Schneider 3 years, 7 months ago
On 06/09/22 16:38, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:
>
>>   #define nr_cpumask_bits nr_cpu_ids
>
> That assumes the CPU space is dense; is this so? That is, you can have 4
> CPUs while the highest cpu number is vastly larger than 4.
>
> It's uncommon, but not unheard of I think. ISTR some BIOSes leaving
> holes in the CPU space when there were empty CPU sockets on the
> motherboard.

I'd assume this would be visible in the cpu_possible_mask and thus be
properly reflected in nr_cpu_ids. Otherwise that would already break with
CONFIG_CPUMASK_OFFSTACK=y, I think.
Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
Posted by Yury Norov 3 years, 7 months ago
On Tue, Sep 6, 2022 at 8:45 AM Valentin Schneider <vschneid@redhat.com> wrote:
>
> On 06/09/22 16:38, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:
> >
> >>   #define nr_cpumask_bits nr_cpu_ids
> >
> > That assumes the CPU space is dense; is this so? That is, you can have 4
> > CPUs while the highest cpu number is vastly larger than 4.

It's quite common. One can configure qemu to give to the user one cpu at
start, and hotplug more on demand. In that case those unattached CPUS
are set in cpu_possible_mask.

> > It's uncommon, but not unheard of I think. ISTR some BIOSes leaving
> > holes in the CPU space when there were empty CPU sockets on the
> > motherboard.

Didn't get my hands on that particular board, but I suspect that those missed
CPUs will be set in possible mask, and unset in present, online and active
masks.

> I'd assume this would be visible in the cpu_possible_mask and thus be
> properly reflected in nr_cpu_ids. Otherwise that would already break with
> CONFIG_CPUMASK_OFFSTACK=y, I think.

Yes.

The nr_cpu_ids is set as:
  find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1

For all board and VM configurations I've been working with, the
cpu_possible_mask
was dense up to nr_cpu_ids.The holes that may appear if HOTPLUG is enabled
or by any other reason are all in the present mask, and therefore
nr_cpu_ids is set
correctly.