[PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible

Huacai Chen posted 3 patches 1 year ago
There is a newer version of this series
include/asm-generic/vmlinux.lds.h | 28 ++++++++++++++++++
kernel/bpf/btf.c                  |  7 +++--
kernel/bpf/sysfs_btf.c            |  6 ++--
kernel/kallsyms.c                 |  6 ----
kernel/kallsyms_internal.h        | 30 ++++++++------------
kernel/ksysfs.c                   |  4 +--
lib/buildid.c                     |  4 +--
7 files changed, 52 insertions(+), 33 deletions(-)
[PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Huacai Chen 1 year ago
Backport this series to 6.1&6.6 because LoongArch gets build errors with
latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
error about linking without -fPIC or -fPIE flag in more detail").

  CC      .vmlinux.export.o
  UPD     include/generated/utsversion.h
  CC      init/version-timestamp.o
  LD      .tmp_vmlinux.kallsyms1
loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673

In theory 5.10&5.15 also need this, but since LoongArch was only merged
upstream in 5.19, I just ignore them, because there are no error reports
from other archs so far.

Weak external linkage is intended for cases where a symbol reference
can remain unsatisfied in the final link. Taking the address of such a
symbol should yield NULL if the reference was not satisfied.

Given that ordinary RIP or PC relative references cannot produce NULL,
some kind of indirection is always needed in such cases, and in position
independent code, this results in a GOT entry. In ordinary code, it is
arch specific but amounts to the same thing.

While unavoidable in some cases, weak references are currently also used
to declare symbols that are always defined in the final link, but not in
the first linker pass. This means we end up with worse codegen for no
good reason. So let's clean this up, by providing preliminary
definitions that are only used as a fallback.
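
As a rough illustration of the difference (a minimal userspace sketch, not
the code from these patches; the demo_* names are hypothetical, and a
GCC-compatible compiler on an ELF target is assumed): a weak reference has
to tolerate a NULL address, while a symbol that is known to be defined does
not.

```c
#include <stddef.h>

/*
 * Weak reference: the symbol may stay undefined in the final link, in
 * which case its address has to evaluate to NULL. Hypothetical name that
 * is intentionally never defined in this sketch.
 */
extern const int demo_optional_table[] __attribute__((weak));

/*
 * Ordinary definition, standing in for a symbol that is always present in
 * the final link. References to it need no NULL check.
 */
const int demo_required_table[] = { 1, 2, 3 };

/* Returns 1 if the weak reference was satisfied at link time, else 0. */
int demo_optional_present(void)
{
	return demo_optional_table != NULL;
}

/* Plain access; the compiler is free to use a direct PC-relative load. */
int demo_required_first(void)
{
	return demo_required_table[0];
}
```

The NULL test in demo_optional_present() is what forces the indirection
described above; demo_required_first() has no such constraint.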

Ard Biesheuvel (3):
  kallsyms: Avoid weak references for kallsyms symbols
  vmlinux: Avoid weak reference to notes section
  btf: Avoid weak external references

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 include/asm-generic/vmlinux.lds.h | 28 ++++++++++++++++++
 kernel/bpf/btf.c                  |  7 +++--
 kernel/bpf/sysfs_btf.c            |  6 ++--
 kernel/kallsyms.c                 |  6 ----
 kernel/kallsyms_internal.h        | 30 ++++++++------------
 kernel/ksysfs.c                   |  4 +--
 lib/buildid.c                     |  4 +--
 7 files changed, 52 insertions(+), 33 deletions(-)
---
2.27.0
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Greg Kroah-Hartman 1 year ago
On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
> Backport this series to 6.1&6.6 because LoongArch gets build errors with
> latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
> error about linking without -fPIC or -fPIE flag in more detail").
> 
>   CC      .vmlinux.export.o
>   UPD     include/generated/utsversion.h
>   CC      init/version-timestamp.o
>   LD      .tmp_vmlinux.kallsyms1
> loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
> loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
> loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
> loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
> 
> In theory 5.10&5.15 also need this, but since LoongArch get upstream at
> 5.19, so I just ignore them because there is no error report about other
> archs now.

Odd, why doesn't this affect other arches as well using new binutils?  I
hate to have to backport all of this just for one arch, as that feels
odd.

thanks,

greg k-h
Re: Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by WangYuli 10 months, 2 weeks ago
Hi, Greg,

It's rather unfortunate that currently, almost all Linux distributions
supporting LoongArch are using LTS kernels version v6.6 or older, such as
openEuler and deepin. [1][2]

If this bugfix isn't merged into linux-stable, then every single distro
kernel team will have to waste time fixing the same darn bug over and
over, even though it's already fixed in later kernels.

This would really make LTS look like it's failing to serve its intended
purpose. And I'm sure all of us do not want to see something so terrible
happen.

On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman wrote:
> Odd, why doesn't this affect other arches as well using new binutils?  I
> hate to have to backport all of this just for one arch, as that feels
> odd.

Could you help me understand why you expressed that you "hate" to have
to backport something for only one arch?
Given that we've historically done quite a bit of similar backporting for
architectures such as arm, powerpc, and x86...It's not exactly unprecedented.
I just want to grasp the rationale, as it all seems perfectly justified
and necessary.

Moreover, with all the active and strict code reviews by all developers,
such occurrences are not frequent on LoongArch. You would not exactly be
"always" backporting something like this just for LoongArch, so perhaps
that might make you and your colleagues feel a little less "hate" :-)

As for your questions on the root cause of the issue and the effectiveness
of this fix, I reckon Xi Ruoyao's explanation and Ard Biesheuvel's
supplementary points have already provided ample details. [3][4][5]

If, after your feedback, you still have any lingering doubts regarding the
issue itself or the LoongArch architecture, I believe that Xi Ruoyao,
Ard Biesheuvel, and Huacai Chen would all be more than willing to elaborate
further.

I'm bringing this up because we've encountered concrete issues in the
process of maintaining distributions. Furthermore, as an upstream resource,
linux-stable can help us more effectively drive forward community
development efforts.
Plus, we realize this benefits all Linux community developers just the same.

Hoping you could spare a moment from your busy schedule to take another look
at this patch series and perhaps reconsider the LTS inclusion of this bugfix.

[1]. https://gitee.com/openeuler/kernel/blob/openEuler-25.03/Makefile#L3
[2]. https://github.com/deepin-community/kernel/blob/linux-6.6.y/Makefile#L3
[3]. https://lore.kernel.org/all/ccb1fa9034b177042db8fcbe7a95a2a5b466dc30.camel@xry111.site/
[4]. https://lore.kernel.org/all/CAMj1kXEV+HC+2HMLhDaLfAufQLrXRs2J7akMNr1mjejDYc7kdw@mail.gmail.com/#t
[5]. https://lore.kernel.org/all/c9a43e5da01ee2215393c0f3c50956171fe660ab.camel@xry111.site/

Best Regards,
--
WangYuli
Re: Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Greg KH 10 months, 2 weeks ago
On Thu, Feb 06, 2025 at 04:37:02PM +0800, WangYuli wrote:
> Hi, Greg,
> 
> It's rather unfortunate that currently, almost all Linux distributions
> supporting LoongArch are using LTS kernels version v6.6 or older, such as
> openEuler and deepin. [1][2]
> 
> If this bugfix isn't merged into linux-stable, then every single distro
> kernel team will have to waste time fixing the same darn bug over and
> over, even though it's already fixed in later kernels.
> 
> This would really make LTS look like it's failing to serve its intended
> purpose. And I'm sure all of us do not want to see something so terrible
> happen.

LTS is here to ensure that the original release of these branches, keeps
working for that branch.  Adding support for newer toolchains sometimes
happens, but is not a requirement or a normal thing to do as that really
isn't a "regression", right?

Most of the time, fixing things up for newer compilers is simple.
Sometimes it is not simple.  The "not simple" ones we usually just do
not backport as that causes extra work for everyone over time.

As for the distros like openEuler, and deepin, they are free to add
these patches there, on top of their other non-LTS patches, right?

thanks,

greg k-h
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Ming Wang 3 months, 1 week ago
Hi Greg, all,

On 2/6/25 18:03, Greg KH wrote:
> On Thu, Feb 06, 2025 at 04:37:02PM +0800, WangYuli wrote:
>> Hi, Greg,
>>
>> It's rather unfortunate that currently, almost all Linux distributions
>> supporting LoongArch are using LTS kernels version v6.6 or older, such as
>> openEuler and deepin. [1][2]
>>
>> If this bugfix isn't merged into linux-stable, then every single distro
>> kernel team will have to waste time fixing the same darn bug over and
>> over, even though it's already fixed in later kernels.
>>
>> This would really make LTS look like it's failing to serve its intended
>> purpose. And I'm sure all of us do not want to see something so terrible
>> happen.
> 
> LTS is here to ensure that the original release of these branches, keeps
> working for that branch.  Adding support for newer toolchains sometimes
> happens, but is not a requirement or a normal thing to do as that really
> isn't a "regression", right?
> 
> Most of the time, fixing things up for newer compilers is simple.
> Sometimes it is not simple.  The "not simple" ones we usually just do
> not backport as that causes extra work for everyone over time.
> 
> As for the distros like openEuler, and deepin, they are free to add
> these patches there, on top of their other non-LTS patches, right?
> 
> thanks,
> 
> greg k-h

I'm writing to follow up on this important discussion. I have carefully
read the entire thread, including your explanation of the LTS philosophy
regarding support for new toolchains. I understand and respect the
principle that LTS aims to maintain stability for the environment in
which it was released, and that adapting to future toolchains is
primarily a distributor's responsibility.

However, I would like to respectfully ask for a reconsideration by
framing this issue from a slightly different perspective, based on the
excellent technical analysis provided by Xi Ruoyao and Ard Biesheuvel.

This situation appears to be more than just an incompatibility with a
"newer" toolchain. As Xi Ruoyao detailed, the older toolchains did not
"work correctly" but instead had a silent bug that produced incorrect
code for undefined weak symbols on LoongArch. The new binutils version
did not introduce a regression, but rather, it correctly started
erroring out on this problematic code pattern, thus exposing a
pre-existing, latent issue.

From this viewpoint, this patch series is less about "adding support for
a new toolchain" and more about "fixing a latent bug that was previously
hidden by silent toolchain defects."

Furthermore, the patches themselves, originally authored by Ard, 
represent a clean, correct, and low-risk improvement. They were accepted 
into the mainline not just as a workaround, but as a superior way to 
handle these symbols, improving codegen for all architectures. 
Backporting this series would therefore be applying a high-quality, 
vetted bug fix that also has the fortunate side effect of resolving this 
build failure.

While the build failure currently only manifests on LoongArch, the
underlying code improvement is generic. For a relatively new 
architecture like LoongArch, ensuring that the primary LTS kernels are 
usable with modern, widely-adopted toolchains is crucial for the health 
and growth of its ecosystem within the broader Linux community. As 
WangYuli pointed out, this would prevent fragmented efforts across 
multiple distributions.

In summary, we believe this case is exceptional because the patch fixes
a latent issue exposed by a toolchain correction and represents a clean,
mainline-accepted improvement. We would be very grateful if you could
take another look at this series from this perspective.

Thank you for your time.

Best Regards,
Robin
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Greg KH 3 months, 1 week ago
On Fri, Sep 05, 2025 at 02:49:44PM +0800, Ming Wang wrote:
> Hi Greg, all,
> 
> On 2/6/25 18:03, Greg KH wrote:
> > On Thu, Feb 06, 2025 at 04:37:02PM +0800, WangYuli wrote:
> > > Hi, Greg,
> > > 
> > > It's rather unfortunate that currently, almost all Linux distributions
> > > supporting LoongArch are using LTS kernels version v6.6 or older, such as
> > > openEuler and deepin. [1][2]
> > > 
> > > If this bugfix isn't merged into linux-stable, then every single distro
> > > kernel team will have to waste time fixing the same darn bug over and
> > > over, even though it's already fixed in later kernels.
> > > 
> > > This would really make LTS look like it's failing to serve its intended
> > > purpose. And I'm sure all of us do not want to see something so terrible
> > > happen.
> > 
> > LTS is here to ensure that the original release of these branches, keeps
> > working for that branch.  Adding support for newer toolchains sometimes
> > happens, but is not a requirement or a normal thing to do as that really
> > isn't a "regression", right?
> > 
> > Most of the time, fixing things up for newer compilers is simple.
> > Sometimes it is not simple.  The "not simple" ones we usually just do
> > not backport as that causes extra work for everyone over time.
> > 
> > As for the distros like openEuler, and deepin, they are free to add
> > these patches there, on top of their other non-LTS patches, right?
> > 
> > thanks,
> > 
> > greg k-h
> 
> I'm writing to follow up on this important discussion. I have carefully
> read the entire thread, including your explanation of the LTS philosophy
> regarding support for new toolchains. I understand and respect the
> principle that LTS aims to maintain stability for the environment in
> which it was released, and that adapting to future toolchains is
> primarily a distributor's responsibility.
> 
> However, I would like to respectfully ask for a reconsideration by
> framing this issue from a slightly different perspective, based on the
> excellent technical analysis provided by Xi Ruoyao and Ard Biesheuvel.

<snip>

I'm sorry, but for an email thread that happened 6+ months ago, it's a
bit hard to try to remember anything involved in it.

Heck, I can't remember an email thread from last week.

Remember, some of us get 1000+ emails a day to deal with.

If you feel a patch set should be applied to a stable tree, and it has
been rejected in the past, feel free to resubmit it with all of the new
information about why the previous rejection was wrong and why it really
should be applied this time.  Otherwise, there's really nothing I could
possibly do here as the patches are long gone from everyone's review
queues.

Also, why aren't you just using 6.12.y now?  :)

thanks,

greg k-h
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Ming Wang 3 months, 1 week ago
Hi Greg,

On 9/5/25 15:09, Greg KH wrote:
> On Fri, Sep 05, 2025 at 02:49:44PM +0800, Ming Wang wrote:
>> Hi Greg, all,
>>
>> On 2/6/25 18:03, Greg KH wrote:
>>> On Thu, Feb 06, 2025 at 04:37:02PM +0800, WangYuli wrote:
>>>> Hi, Greg,
>>>>
>>>> It's rather unfortunate that currently, almost all Linux distributions
>>>> supporting LoongArch are using LTS kernels version v6.6 or older, such as
>>>> openEuler and deepin. [1][2]
>>>>
>>>> If this bugfix isn't merged into linux-stable, then every single distro
>>>> kernel team will have to waste time fixing the same darn bug over and
>>>> over, even though it's already fixed in later kernels.
>>>>
>>>> This would really make LTS look like it's failing to serve its intended
>>>> purpose. And I'm sure all of us do not want to see something so terrible
>>>> happen.
>>>
>>> LTS is here to ensure that the original release of these branches, keeps
>>> working for that branch.  Adding support for newer toolchains sometimes
>>> happens, but is not a requirement or a normal thing to do as that really
>>> isn't a "regression", right?
>>>
>>> Most of the time, fixing things up for newer compilers is simple.
>>> Sometimes it is not simple.  The "not simple" ones we usually just do
>>> not backport as that causes extra work for everyone over time.
>>>
>>> As for the distros like openEuler, and deepin, they are free to add
>>> these patches there, on top of their other non-LTS patches, right?
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> I'm writing to follow up on this important discussion. I have carefully
>> read the entire thread, including your explanation of the LTS philosophy
>> regarding support for new toolchains. I understand and respect the
>> principle that LTS aims to maintain stability for the environment in
>> which it was released, and that adapting to future toolchains is
>> primarily a distributor's responsibility.
>>
>> However, I would like to respectfully ask for a reconsideration by
>> framing this issue from a slightly different perspective, based on the
>> excellent technical analysis provided by Xi Ruoyao and Ard Biesheuvel.
> 
> <snip>
> 
> i'm sorry, but for an email thread that happened 6+ months ago, it's a
> bit hard to try to remember anything involved in it.
> 
> Heck, I can't remember an email thread from last week.
> 
> Remember, some of us get 1000+ emails a day to deal with.
> 
> If you feel a patch set should be applied to a stable tree, and it has
> been rejected in the past, feel free to resubmit it with all of the new
> information about why the previous rejection was wrong and why it really
> should be applied this time.  Otherwise, there's really nothing I could
> possibly do here as the patches are long gone from everyone's review
> queues.
> 
> Also, why aren't you just using 6.12.y now?  :)
> 
> thanks,
> 
> greg k-h

Thank you for the quick reply. You've raised a very fair point about
moving to a newer kernel. Thanks again for your time and the guidance.

Best Regards,
Robin
Re: Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Ard Biesheuvel 10 months, 2 weeks ago
On Thu, 6 Feb 2025 at 09:53, WangYuli <wangyuli@uniontech.com> wrote:
>
> Hi, Greg,
>
> It's rather unfortunate that currently, almost all Linux distributions
> supporting LoongArch are using LTS kernels version v6.6 or older, such as
> openEuler and deepin. [1][2]
>
> If this bugfix isn't merged into linux-stable, then every single distro
> kernel team will have to waste time fixing the same darn bug over and
> over, even though it's already fixed in later kernels.
>
> This would really make LTS look like it's failing to serve its intended
> purpose. And I'm sure all of us do not want to see something so terrible
> happen.
>
> On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman wrote:
> > Odd, why doesn't this affect other arches as well using new binutils?  I
> > hate to have to backport all of this just for one arch, as that feels
> > odd.
>
> Could you help me understand why you expressed that you "hate" to have
> to backport something for only one arch?
> Given that we've historically done quite a bit of similar backporting for
> architectures such as arm, powerpc, and x86...It's not exactly unprecedented.
> I just want to grasp the rationale, as it all seems perfectly justified
> and necessary.
>
> Moreover, with all the active and strict code reviews by all developers,
> such occurrences are not frequent on LoongArch. You could be not exactly
> "always" backporting something like this just for LoongArch, so perhaps
> that might make you and your colleagues feel a little less "hate" :-)
>
> As for your questions on the root cause of the issue and the effectiveness
> of this fix, I reckon Xi Ruoyao's explanation and Ard Biesheuvel's
> supplementary points have already provided ample details. [3][4][5]
>
> If, after your feedback, you still have any lingering doubts regarding the
> issue itself or the LoongArch architecture, I believe that Xi Ruoyao,
> Ard Biesheuvel, and Huacai Chen would all be more than willing to elaborate
> further.
>
> I'm bringing this up because we've encountered concrete issues in the
> process of maintaining distributions. Furthermore, as an upstream resource,
> linux-stable can help us more effectively drive forward community
> development efforts.
> Plus, we realize this benefits all Linux community developers just the same.
>
> Hoping you could spare a moment from your busy schedule to take another look
> at this patch series and perhaps reconsider the LTS inclusion of this bugfix.
>
> [1]. https://gitee.com/openeuler/kernel/blob/openEuler-25.03/Makefile#L3
> [2]. https://github.com/deepin-community/kernel/blob/linux-6.6.y/Makefile#L3
> [3]. https://lore.kernel.org/all/ccb1fa9034b177042db8fcbe7a95a2a5b466dc30.camel@xry111.site/
> [4]. https://lore.kernel.org/all/CAMj1kXEV+HC+2HMLhDaLfAufQLrXRs2J7akMNr1mjejDYc7kdw@mail.gmail.com/#t
> [5]. https://lore.kernel.org/all/c9a43e5da01ee2215393c0f3c50956171fe660ab.camel@xry111.site/
>

You might consider sending a LoongArch-only patch for mainline that
adds weak definitions of these symbols, and backport that to -stable
once it hits Linus's tree. That way, the weak references are always
satisfied, even during the first linker pass.
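
To sketch the idea in plain C (a minimal illustration only; the demo_*
name is hypothetical, and the real symbols, their types and the place
where such definitions would live are not shown here): a weak definition
acts as a fallback that any strong definition of the same symbol overrides
at link time, so a reference to it never resolves to NULL.

```c
#include <stddef.h>

/*
 * Weak fallback definition: used only when no strong definition of the
 * same symbol is linked in. Hypothetical stand-in for one of the tables
 * that is only generated in a later linker pass.
 */
const unsigned char demo_names[] __attribute__((weak)) = { 0 };

/*
 * The reference resolves either to the fallback above or to a strong
 * definition from another object file, so it is always satisfied.
 */
const unsigned char *demo_names_ptr(void)
{
	return demo_names;
}
```

With only this file linked, demo_names_ptr() points at the one-byte
fallback; once another object provides a strong demo_names, the linker
picks that one instead.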
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Huacai Chen 1 year ago
Hi, Greg,

On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
> > Backport this series to 6.1&6.6 because LoongArch gets build errors with
> > latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
> > error about linking without -fPIC or -fPIE flag in more detail").
> >
> >   CC      .vmlinux.export.o
> >   UPD     include/generated/utsversion.h
> >   CC      init/version-timestamp.o
> >   LD      .tmp_vmlinux.kallsyms1
> > loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
> > loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
> > loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
> > loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
> >
> > In theory 5.10&5.15 also need this, but since LoongArch get upstream at
> > 5.19, so I just ignore them because there is no error report about other
> > archs now.
>
> Odd, why doesn't this affect other arches as well using new binutils?  I
> hate to have to backport all of this just for one arch, as that feels
> odd.
The related binutils commit is only for LoongArch, so build errors
only occurred on LoongArch. I don't know exactly why other archs have no
problem, but it may be related to their CFLAGS (for example, if
we disable CONFIG_RELOCATABLE, LoongArch also has no build errors
because CFLAGS changes).

On the other hand, Ard's original patches are not for LoongArch only,
so I think backport to stable branches is also not for LoongArch only.

Huacai

>
> thanks,
>
> greg k-h
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Greg Kroah-Hartman 1 year ago
On Sat, Dec 07, 2024 at 05:21:00PM +0800, Huacai Chen wrote:
> Hi, Greg,
> 
> On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
> > > Backport this series to 6.1&6.6 because LoongArch gets build errors with
> > > latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
> > > error about linking without -fPIC or -fPIE flag in more detail").
> > >
> > >   CC      .vmlinux.export.o
> > >   UPD     include/generated/utsversion.h
> > >   CC      init/version-timestamp.o
> > >   LD      .tmp_vmlinux.kallsyms1
> > > loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
> > > loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
> > > loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
> > > loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
> > >
> > > In theory 5.10&5.15 also need this, but since LoongArch get upstream at
> > > 5.19, so I just ignore them because there is no error report about other
> > > archs now.
> >
> > Odd, why doesn't this affect other arches as well using new binutils?  I
> > hate to have to backport all of this just for one arch, as that feels
> > odd.
> The related binutils commit is only for LoongArch, so build errors
> only occurred on LoongArch. I don't know why other archs have no
> problem exactly, but may be related to their CFLAGS (for example, if
> we disable CONFIG_RELOCATABLE, LoongArch also has no build errors
> because CFLAGS changes).

does LoongArch depend on that option?  What happens if it is enabled for
other arches?  Why doesn't it break them?

> On the other hand, Ard's original patches are not for LoongArch only,
> so I think backport to stable branches is also not for LoongArch only.

Maybe Ard can answer that.

thanks,

greg k-h
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Xi Ruoyao 1 year ago
On Sat, 2024-12-07 at 10:32 +0100, Greg Kroah-Hartman wrote:
> On Sat, Dec 07, 2024 at 05:21:00PM +0800, Huacai Chen wrote:
> > Hi, Greg,
> > 
> > On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > > 
> > > On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
> > > > Backport this series to 6.1&6.6 because LoongArch gets build errors with
> > > > latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
> > > > error about linking without -fPIC or -fPIE flag in more detail").
> > > > 
> > > >   CC      .vmlinux.export.o
> > > >   UPD     include/generated/utsversion.h
> > > >   CC      init/version-timestamp.o
> > > >   LD      .tmp_vmlinux.kallsyms1
> > > > loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
> > > > loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
> > > > loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
> > > > loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
> > > > 
> > > > In theory 5.10&5.15 also need this, but since LoongArch get upstream at
> > > > 5.19, so I just ignore them because there is no error report about other
> > > > archs now.
> > > 
> > > Odd, why doesn't this affect other arches as well using new binutils?  I
> > > hate to have to backport all of this just for one arch, as that feels
> > > odd.
> > The related binutils commit is only for LoongArch, so build errors
> > only occurred on LoongArch. I don't know why other archs have no
> > problem exactly, but may be related to their CFLAGS (for example, if
> > we disable CONFIG_RELOCATABLE, LoongArch also has no build errors
> > because CFLAGS changes).
> 
> does LoongArch depend on that option?

"That option" is -mdirect-extern-access.  Without it we'll use GOT in
the kernel image to address anything out of the current TU, bloating the
kernel size and making it slower.

The problem is the linker failed to handle a direct access to an undefined
weak symbol on LoongArch.  With GCC 14.2 and Binutils 2.43:

$ cat t.c
extern int x __attribute__ ((weak));

int main()
{
	__builtin_printf("%p\n", &x);
}
$ cc t.c -mdirect-extern-access -static-pie -fPIE
$ ./a.out
0x7ffff27ac000

The output should be (nil) instead, as an undefined weak symbol should
be resolved to address 0.  I'm not sure why the kernel was not blown up
by this issue.

With Binutils trunk, an error is emitted instead of silently producing a
buggy executable.  Still I don't think emitting an error is correct when
linking a static PIE (our vmlinux is a static PIE).  Instead the linker
should just rewrite

    pcalau12i rd, %pc_hi20(undef_weak)

to

    move rd, $zero

Also the "recompile with -fPIE" suggestion in the error message is
completely misleading.  We are *already* compiling relocatable kernel
with -fPIE.

I'm making some Binutils patches to implement the rewrite and reword the
error message (for instances where emitting an error is the correct
thing, e.g. someone attempts to build a dynamically linked program with
-mdirect-extern-access).

> What happens if it is enabled for other arches?  Why doesn't it break
> them?

The other arches have copy relocation, so their -mdirect-extern-access
is intended to work with dynamically linked executables, thus it's the
default and not as strong as ours.  On them -mdirect-extern-access still
uses GOT to address weak symbols.

We don't have copy relocation, thus our default is
-mno-direct-extern-access, and -mdirect-extern-access is only intended
for static executables (including OS kernels, embedded firmware, etc.).
So it's designed to be stronger; unfortunately the toolchain failed to
implement it correctly.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Ard Biesheuvel 1 year ago
On Sat, 7 Dec 2024 at 11:46, Xi Ruoyao <xry111@xry111.site> wrote:
>
> On Sat, 2024-12-07 at 10:32 +0100, Greg Kroah-Hartman wrote:
> > On Sat, Dec 07, 2024 at 05:21:00PM +0800, Huacai Chen wrote:
> > > Hi, Greg,
> > >
> > > On Fri, Dec 6, 2024 at 9:04 PM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Fri, Dec 06, 2024 at 04:58:07PM +0800, Huacai Chen wrote:
> > > > > Backport this series to 6.1&6.6 because LoongArch gets build errors with
> > > > > latest binutils which has commit 599df6e2db17d1c4 ("ld, LoongArch: print
> > > > > error about linking without -fPIC or -fPIE flag in more detail").
> > > > >
> > > > >   CC      .vmlinux.export.o
> > > > >   UPD     include/generated/utsversion.h
> > > > >   CC      init/version-timestamp.o
> > > > >   LD      .tmp_vmlinux.kallsyms1
> > > > > loongarch64-unknown-linux-gnu-ld: kernel/kallsyms.o:(.text+0): relocation R_LARCH_PCALA_HI20 against `kallsyms_markers` can not be used when making a PIE object; recompile with -fPIE
> > > > > loongarch64-unknown-linux-gnu-ld: kernel/crash_core.o:(.init.text+0x984): relocation R_LARCH_PCALA_HI20 against `kallsyms_names` can not be used when making a PIE object; recompile with -fPIE
> > > > > loongarch64-unknown-linux-gnu-ld: kernel/bpf/btf.o:(.text+0xcc7c): relocation R_LARCH_PCALA_HI20 against `__start_BTF` can not be used when making a PIE object; recompile with -fPIE
> > > > > loongarch64-unknown-linux-gnu-ld: BFD (GNU Binutils) 2.43.50.20241126 assertion fail ../../bfd/elfnn-loongarch.c:2673
> > > > >
> > > > > In theory 5.10&5.15 also need this, but since LoongArch get upstream at
> > > > > 5.19, so I just ignore them because there is no error report about other
> > > > > archs now.
> > > >
> > > > Odd, why doesn't this affect other arches as well using new binutils?  I
> > > > hate to have to backport all of this just for one arch, as that feels
> > > > odd.
> > > The related binutils commit is only for LoongArch, so build errors
> > > only occured on LoongArch. I don't know why other archs have no
> > > problem exactly, but may be related to their CFLAGS (for example, if
> > > we disable CONFIG_RELOCATABLE, LoongArch also has no build errors
> > > because CFLAGS changes).
> >
> > does LoongArch depend on that option?
>
> "That option" is -mdirect-extern-access.  Without it we'll use GOT in
> the kernel image to address anything out of the current TU, bloating the
> kernel size and making it slower.
>

An alternative to this might be to add

-include $(srctree)/include/linux/hidden.h

to KBUILD_CFLAGS_KERNEL, so that the compiler understands that all
external references are resolved at link time, not at load/run time.
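
As a rough illustration of what such a forced include buys: the sketch
below assumes the header boils down to a hidden-visibility pragma, and
all demo_* names are hypothetical stand-ins, not kernel symbols.  With
hidden visibility in effect, the compiler may treat the extern
declarations as resolved within the final link rather than at load time.

```c
/* Assumption: a header like hidden.h essentially amounts to this pragma. */
#pragma GCC visibility push(hidden)

/* Hypothetical table standing in for a symbol provided elsewhere in the link. */
extern const int demo_table[];
extern const unsigned int demo_table_len;

/* Sum the table through the hidden external references. */
int demo_sum(void)
{
	int sum = 0;

	for (unsigned int i = 0; i < demo_table_len; i++)
		sum += demo_table[i];
	return sum;
}

/* Definitions; in a real build these would live in another object file. */
const int demo_table[] = { 1, 2, 3 };
const unsigned int demo_table_len = 3;

#pragma GCC visibility pop
```

Whether this alone changes how weak references are emitted on LoongArch
is not something the sketch demonstrates.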

> The problem is the linker failed to handle a direct access to undefined
> weak symbol on LoongArch.
...
> With Binutils trunk, an error is emitted instead of silently producing
> buggy executable.  Still I don't think emitting an error is correct when
> linking a static PIE (our vmlinux is a static PIE).  Instead the linker
> should just rewrite
>
>     pcalau12i rd, %pc_hi20(undef_weak)
>
> to
>
>     move rd, $zero
>

Is that transformation even possible at link time? Isn't pc_hi20 part of a pair?

> Also the "recompile with -fPIE" suggestion in the error message is
> completely misleading.  We are *already* compiling relocatable kernel
> with -fPIE.
>

And this is the most important difference between LoongArch and the
other arches: LoongArch already uses PIC code explicitly. Other
architectures use ordinary position-dependent codegen and linking, or,
in the case of arm64, use position-dependent codegen and PIE linking,
where the fact that this is even possible is a happy accident.

...
> > What happens if it is enabled for other arches?  Why doesn't it break
> > them?
>
> The other arches have copy relocation, so their -mdirect-extern-access
> is intended to work with dynamically linked executable, thus it's the
> default and not as strong as ours.  On them -mdirect-extern-access still
> uses GOT to address weak symbols.
>
> We don't have copy relocation, thus our default is -mno-direct-extern-
> access, and -mdirect-extern-access is only intended for static
> executables (including OS kernel, embedded firmware, etc).  So it's
> designed to be stronger, unfortunately the toolchain failed to implement
> it correctly.
>

This has nothing to do with copy relocations - those are only relevant
when shared libraries come into play.

Other architectures don't break because they either a) use position
dependent codegen with absolute addressing, and simply resolve
undefined weak references as 0x0, or b) use GOT indirection, where the
reference is a GOT load and the address in the GOT is set to 0x0.

So the issue here appears to be that the compiler fails to emit a GOT
entry for this reference, even though it is performing PIC codegen.
This is probably due to -mdirect-extern-access being taken into
account too strictly. The upshot is that a relative reference is
emitted to an undefined symbol, and it is impossible for a relative
reference to [reliably] yield NULL, and so the reference produces a
bogus non-NULL address.
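
For reference, the contract that weak external linkage is supposed to
provide can be sketched as below.  This assumes a typical Linux
toolchain where the weak reference is resolved via a GOT entry or an
absolute relocation; demo_optional_blob is a hypothetical symbol that
nothing defines.

```c
#include <stddef.h>

/* Hypothetical symbol that no object in this program defines. */
extern const char demo_optional_blob[] __attribute__((weak));

/* Returns 1 if the weak reference was satisfied at link time, 0 otherwise. */
int demo_blob_present(void)
{
	return demo_optional_blob != NULL;
}
```

With a PC-relative reference to the same undefined symbol, this check
has no reliable way to come out as "not present".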

As these patches deal with symbols that are only undefined in the
preliminary first linker pass, and are guaranteed to exist afterwards,
silently emitting a bogus relative reference was not a problem in
these cases. Obviously, throwing an error is.

The patches should be rather harmless in practice, but I know Masahiro
did not like the approach for the kallsyms markers, and made some
subsequent modifications to it.

Given that this is relatively new toolchain behavior, I'd suggest
fixing the compiler to emit weak external references via GOT entries
even when -mdirect-extern-access is in effect.
Re: [PATCH 6.1&6.6 0/3] kbuild: Avoid weak external linkage where possible
Posted by Xi Ruoyao 1 year ago
On Mon, 2024-12-09 at 09:31 +0100, Ard Biesheuvel wrote:
> Given that this is relatively new toolchain behavior, I'd suggest
> fixing the compiler to emit weak external references via GOT entries
> even when  -mdirect-extern-access is in effect.

I'm working on an approach in the linker instead.  A PC-relative address
within a +/- 2GiB range is materialized with

pcalau12i $a0, %pc_hi20(sym + addend)
addi.d $a0, $a0, %pc_lo12(sym + addend)

When linking statically and sym is weak undefined, we should just
load addend.  The compiler already guarantees addend is in the
[-2**31, 2**31) range, so we just need to rewrite the pair to

lu12i.w $a0, ((addend + 0x800) >> 12)
addi.d $a0, $a0, (addend & 0xfff)
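
The arithmetic behind such a hi/lo split can be sketched in C.  This is
only an illustration, assuming the usual LoongArch immediates (lu12i.w
places a signed 20-bit value at bits 31..12, addi.d adds a signed 12-bit
value); the demo_* names are made up here and are not linker code.

```c
#include <stdint.h>

struct demo_hi_lo {
	int32_t hi20;	/* immediate for lu12i.w: value placed at bits 31..12 */
	int32_t lo12;	/* signed 12-bit immediate for addi.d */
};

/*
 * Split addend so that hi20 * 4096 + lo12 == addend.
 * hi20 fits a signed 20-bit field as long as addend + 0x800 < 2**31.
 */
struct demo_hi_lo demo_split(int32_t addend)
{
	struct demo_hi_lo r;
	int64_t a = addend;

	/* Low 12 bits, sign-extended to [-2048, 2047]. */
	r.lo12 = (int32_t)(((a & 0xfff) ^ 0x800) - 0x800);
	/* Remaining upper part; the subtraction makes the division exact. */
	r.hi20 = (int32_t)((a - r.lo12) / 4096);
	return r;
}

/* Recombine the two immediates the way the instruction pair would. */
int64_t demo_combine(struct demo_hi_lo r)
{
	return (int64_t)r.hi20 * 4096 + r.lo12;
}
```

The rounding by 0x800 exists because the low part is sign-extended: when
bit 11 of addend is set, lo12 is negative and hi20 has to be one larger
to compensate.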

OTOH, when not linking statically, the user shouldn't use
-mdirect-extern-access at all [this rule is the part related to copy
relocations: if copy relocations were available, it might be valid to use
-mdirect-extern-access without static linking], and the linker is correct
to report an error (but the error message is unclear and I need to fix it
anyway).

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University