[PATCH v2 0/3] mm: Restrict the static definition of the per-CPU variable _shared_alloc_tag to s390 and alpha architectures only

Hao Ge posted 3 patches 3 months, 4 weeks ago
There is a newer version of this series
Posted by Hao Ge 3 months, 4 weeks ago
From: Hao Ge <gehao@kylinos.cn>

I recently noticed this entry while checking kallsyms on ARM64:
ffff800083e509c0 D _shared_alloc_tag

If ARCH_NEEDS_WEAK_PER_CPU is not defined (it is only defined for the
s390 and alpha architectures), there is no need to statically define
the percpu variable _shared_alloc_tag. As the number of CPUs
increases, the wasted memory grows correspondingly.

Therefore, the static definition should be restricted to the
architectures that actually need it.
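
As a rough illustration of the scaling argument (a sketch, not code from
this series): a statically defined percpu variable gets one copy per CPU,
so its footprint is roughly the object size times the CPU count. The
struct layout (two 64-bit counters) and the helper name
shared_tag_footprint() are assumptions made for this example.

```c
#include <assert.h>	/* only for the usage check in the note below */
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for illustration: two 64-bit counters per CPU. */
struct alloc_tag_counters {
	uint64_t bytes;
	uint64_t calls;
};

/* Hypothetical helper: bytes used by one static percpu instance. */
static size_t shared_tag_footprint(unsigned int nr_cpus)
{
	return (size_t)nr_cpus * sizeof(struct alloc_tag_counters);
}
```

Under these assumptions, assert(shared_tag_footprint(256) == 4096) holds:
the cost per CPU is small, but it is paid on every architecture that never
uses the variable.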

However, ARCH_NEEDS_WEAK_PER_CPU is currently a #define that is
enclosed in an #if defined(MODULE) conditional block.

When building the core kernel code for s390 or alpha,
ARCH_NEEDS_WEAK_PER_CPU therefore remains undefined. It is only
defined when building modules for these architectures.
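
The gating described above can be sketched as a standalone C file (an
illustration only; the real definition lives in the arch percpu.h
headers, and weak_per_cpu_visible() is a hypothetical helper). Compiled
without -DMODULE it behaves like a core kernel build; compiled with
-DMODULE it behaves like a module build.

```c
#include <assert.h>	/* only for the usage check in the note below */

/* Stand-in for the arch header: the macro exists only in module builds. */
#if defined(MODULE)
#define ARCH_NEEDS_WEAK_PER_CPU
#endif

/* Hypothetical helper: reports whether this translation unit sees the macro. */
static int weak_per_cpu_visible(void)
{
#ifdef ARCH_NEEDS_WEAK_PER_CPU
	return 1;	/* module build: macro is defined */
#else
	return 0;	/* core kernel build: macro is undefined */
#endif
}
```

Built without -DMODULE, assert(weak_per_cpu_visible() == 0) holds, which
is why core kernel code such as lib/alloc_tag.c cannot test this macro to
decide whether to define _shared_alloc_tag.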

Therefore, make ARCH_NEEDS_WEAK_PER_CPU a Kconfig option and replace
all existing uses of the ARCH_NEEDS_WEAK_PER_CPU macro in the kernel
code with MODULE_NEEDS_WEAK_PER_CPU. The new name is arguably more
accurate, because the macro is only needed for modules. Then wrap the
static definition of the percpu variable _shared_alloc_tag in a
CONFIG_ARCH_NEEDS_WEAK_PER_CPU condition.
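
The intended end state can be sketched as follows. This is a userspace
mock written for this description, not the patch itself: the struct
layout, the plain global standing in for the kernel's percpu definition,
and the helper shared_alloc_tag_or_null() are all assumptions.

```c
#include <assert.h>	/* only for the usage check in the note below */
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the arch header after the rename: module builds only. */
#if defined(MODULE)
#define MODULE_NEEDS_WEAK_PER_CPU
#endif

/* Assumed layout for illustration. */
struct alloc_tag_counters {
	uint64_t bytes;
	uint64_t calls;
};

/*
 * Stand-in for the static definition: the object is emitted only when
 * the Kconfig symbol is set, which is visible to core kernel builds too.
 */
#ifdef CONFIG_ARCH_NEEDS_WEAK_PER_CPU
struct alloc_tag_counters _shared_alloc_tag;
#endif

/* Hypothetical helper: address of the shared tag, or NULL if not built. */
static struct alloc_tag_counters *shared_alloc_tag_or_null(void)
{
#ifdef CONFIG_ARCH_NEEDS_WEAK_PER_CPU
	return &_shared_alloc_tag;
#else
	return NULL;
#endif
}
```

Built without -DCONFIG_ARCH_NEEDS_WEAK_PER_CPU (the ARM64 case),
assert(shared_alloc_tag_or_null() == NULL) holds and no
_shared_alloc_tag object is emitted. Built with it (the s390 and alpha
case), the object exists regardless of whether MODULE is defined.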

The original version of this change is here:
https://lore.kernel.org/all/20250529073537.563107-1-hao.ge@linux.dev/
Unfortunately, it caused build errors on s390.
Based on Suren's guidance and suggestions, I have reworked it into
this patch series. Many thanks to Suren for his patient instruction.

Verify:
     1. On Arm64:
        nm vmlinux | grep "_shared_alloc_tag"
        (no output is returned)
     2. On S390:
        Compile tested.
        nm vmlinux | grep "_shared_alloc_tag"
        00000000015605b4 r __crc__shared_alloc_tag
        0000000001585fef r __kstrtab__shared_alloc_tag
        0000000001586897 r __kstrtabns__shared_alloc_tag
        00000000014f6548 r __ksymtab__shared_alloc_tag
        0000000001a8fa28 D _shared_alloc_tag
        nm net/ceph/libceph.ko | grep "_shared"
        U _shared_alloc_tag
     3. On alpha:
        Compile tested.
        nm vmlinux | grep "_shared_alloc_tag"
        fffffc0000b080fa r __kstrtab__shared_alloc_tag
        fffffc0000b07ee7 r __kstrtabns__shared_alloc_tag
        fffffc0000adee98 r __ksymtab__shared_alloc_tag
        fffffc0000b83d38 D _shared_alloc_tag
        nm crypto/cryptomgr.ko | grep "_share"
        U _shared_alloc_tag

v2:
    Heiko pointed out that the CONFIG_ARCH_NEEDS_WEAK_PER_CPU condition
    used in v1 when defining MODULE_NEEDS_WEAK_PER_CPU should be
    removed, as it is always true for the s390 and alpha
    architectures. He also pointed out that patches 2-4 should be
    merged into one patch. The code has been modified accordingly and
    the corresponding commit messages updated.

Hao Ge (3):
  mm/Kconfig: add ARCH_NEEDS_WEAK_PER_CPU Option and enable it for
    s390/alpha
  mm: replace ARCH_NEEDS_WEAK_PER_CPU with MODULE_NEEDS_WEAK_PER_CPU
  mm/alloc_tag: add the CONFIG_ARCH_NEEDS_WEAK_PER_CPU macro when
    statically defining the percpu variable _shared_alloc_tag

 arch/alpha/Kconfig              | 1 +
 arch/alpha/include/asm/percpu.h | 2 +-
 arch/s390/Kconfig               | 1 +
 arch/s390/include/asm/percpu.h  | 2 +-
 include/linux/alloc_tag.h       | 6 +++---
 include/linux/percpu-defs.h     | 4 ++--
 lib/alloc_tag.c                 | 2 ++
 mm/Kconfig                      | 4 ++++
 8 files changed, 15 insertions(+), 7 deletions(-)

-- 
2.25.1
Re: [PATCH v2 0/3] mm: Restrict the static definition of the per-CPU variable _shared_alloc_tag to s390 and alpha architectures only
Posted by Suren Baghdasaryan 3 months, 4 weeks ago
On Thu, Jun 12, 2025 at 8:06 PM Hao Ge <hao.ge@linux.dev> wrote:
>
> From: Hao Ge <gehao@kylinos.cn>
>
> I recently noticed this entry while checking kallsyms on ARM64:
> ffff800083e509c0 D _shared_alloc_tag
>
> If ARCH_NEEDS_WEAK_PER_CPU is not defined (it is only defined for the
> s390 and alpha architectures), there is no need to statically define
> the percpu variable _shared_alloc_tag. As the number of CPUs
> increases, the wasted memory grows correspondingly.
>
> Therefore, the static definition should be restricted to the
> architectures that actually need it.
>
> However, ARCH_NEEDS_WEAK_PER_CPU is currently a #define that is
> enclosed in an #if defined(MODULE) conditional block.
>
> When building the core kernel code for s390 or alpha,
> ARCH_NEEDS_WEAK_PER_CPU therefore remains undefined. It is only
> defined when building modules for these architectures.
>
> Therefore, make ARCH_NEEDS_WEAK_PER_CPU a Kconfig option and replace
> all existing uses of the ARCH_NEEDS_WEAK_PER_CPU macro in the kernel
> code with MODULE_NEEDS_WEAK_PER_CPU. The new name is arguably more
> accurate, because the macro is only needed for modules. Then wrap the
> static definition of the percpu variable _shared_alloc_tag in a
> CONFIG_ARCH_NEEDS_WEAK_PER_CPU condition.
>
> The original version of this change is here:
> https://lore.kernel.org/all/20250529073537.563107-1-hao.ge@linux.dev/
> Unfortunately, it caused build errors on s390.
> Based on Suren's guidance and suggestions, I have reworked it into
> this patch series. Many thanks to Suren for his patient instruction.

I think the first two patches in your patchset should be merged together.

>
> Verify:
>      1. On Arm64:
>         nm vmlinux | grep "_shared_alloc_tag"
>         (no output is returned)
>      2. On S390:
>         Compile tested.
>         nm vmlinux | grep "_shared_alloc_tag"
>         00000000015605b4 r __crc__shared_alloc_tag
>         0000000001585fef r __kstrtab__shared_alloc_tag
>         0000000001586897 r __kstrtabns__shared_alloc_tag
>         00000000014f6548 r __ksymtab__shared_alloc_tag
>         0000000001a8fa28 D _shared_alloc_tag
>         nm net/ceph/libceph.ko | grep "_shared"
>         U _shared_alloc_tag
>      3. On alpha:
>         Compile tested.
>         nm vmlinux | grep "_shared_alloc_tag"
>         fffffc0000b080fa r __kstrtab__shared_alloc_tag
>         fffffc0000b07ee7 r __kstrtabns__shared_alloc_tag
>         fffffc0000adee98 r __ksymtab__shared_alloc_tag
>         fffffc0000b83d38 D _shared_alloc_tag
>         nm crypto/cryptomgr.ko | grep "_share"
>         U _shared_alloc_tag
>
> v2:
>     Heiko pointed out that the CONFIG_ARCH_NEEDS_WEAK_PER_CPU condition
>     used in v1 when defining MODULE_NEEDS_WEAK_PER_CPU should be
>     removed, as it is always true for the s390 and alpha
>     architectures. He also pointed out that patches 2-4 should be
>     merged into one patch. The code has been modified accordingly and
>     the corresponding commit messages updated.
>
> Hao Ge (3):
>   mm/Kconfig: add ARCH_NEEDS_WEAK_PER_CPU Option and enable it for
>     s390/alpha
>   mm: replace ARCH_NEEDS_WEAK_PER_CPU with MODULE_NEEDS_WEAK_PER_CPU
>   mm/alloc_tag: add the CONFIG_ARCH_NEEDS_WEAK_PER_CPU macro when
>     statically defining the percpu variable _shared_alloc_tag
>
>  arch/alpha/Kconfig              | 1 +
>  arch/alpha/include/asm/percpu.h | 2 +-
>  arch/s390/Kconfig               | 1 +
>  arch/s390/include/asm/percpu.h  | 2 +-
>  include/linux/alloc_tag.h       | 6 +++---
>  include/linux/percpu-defs.h     | 4 ++--
>  lib/alloc_tag.c                 | 2 ++
>  mm/Kconfig                      | 4 ++++
>  8 files changed, 15 insertions(+), 7 deletions(-)
>
> --
> 2.25.1
>