LoongArch: initial 32-bit UAPI

[PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Jiaxun Yang 1 year, 1 month ago

This series defines the UAPI for LoongArch32, marking my initial step
towards upstreaming support for the architecture. Once the UAPI is
ratified, we can proceed to scrutinise various kernel components to
enable 32-bit support while simultaneously addressing user-space porting.

Why am I upstreaming LoongArch32?
================================
Although 32-bit systems are experiencing declining adoption in general
computing, LoongArch32 remains highly relevant within specific niches.
Beyond embedded applications, several vendors are actively developing
application-level LoongArch32 processors. Loongson, for example, has
released two open-source reference hardware implementations: openLA500
and openLA1000 [6].

The architecture also holds considerable educational value, having been
integrated into China's national computer architecture curricula and
embedded systems courses. Additionally, the National Student Computer
System Capability Challenge (NSCSCC) [1] features LoongArch32 CPUs, where
hundreds of students design Linux-capable hardware implementations and
compete on performance. This initiative has resulted in several exciting
high-performance LoongArch32 cores, including LainCore[2], Wired[3],
NOP-Core[4], NagiCore[5]....

From an upstream perspective, we will largely reuse the infrastructure
already established for LoongArch64, ensuring that the maintenance burden
remains minimal.

Porting Status
==============
The LoongArch32 port has been available downstream for some time, with
various system components hosted on Loongson's Gitee[6]. However, these
components utilise an older downstream ABI and fall short of upstream
quality.

On the upstream front, LLVM-19 now includes experimental support for
LoongArch32 (ILP32 ABI) under the loongarch32* triple, and efforts are
underway to enable GNU toolchain support. My upstream-ready kernel port
and musl libc port can successfully boot into a minimal Buildroot
environment and execute test cases on QEMU virt machine with clang
toolchain.

Thank you for reading. I look forward to your comments and feedback.

[1]: https://www.tsinghua.edu.cn/en/info/1245/13802.htm
[2]: https://github.com/LainChip/LainCore
[3]: https://github.com/gmlayer0/wired
[4]: https://github.com/NOP-Processor/NOP-Core
[5]: https://github.com/MrAMS/NagiCore
[6]: https://gitee.com/loongson-edu

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
Jiaxun Yang (3):
      loongarch: Wire up 32 bit syscalls
      loongarch: Introduce sys_loongarch_flush_icache syscall
      loongarch: vdso: Introduce __vdso_flush_icache function

 arch/loongarch/include/asm/Kbuild          |  1 +
 arch/loongarch/include/asm/cacheflush.h    |  6 ++++
 arch/loongarch/include/asm/syscall.h       |  2 ++
 arch/loongarch/include/asm/vdso/vdso.h     | 10 ++++++
 arch/loongarch/include/asm/vdso/vsyscall.h |  1 +
 arch/loongarch/include/uapi/asm/Kbuild     |  1 +
 arch/loongarch/include/uapi/asm/unistd.h   |  6 ++++
 arch/loongarch/kernel/Makefile.syscalls    |  3 +-
 arch/loongarch/kernel/syscall.c            | 49 +++++++++++++++++++++++++++++
 arch/loongarch/kernel/vdso.c               |  2 ++
 arch/loongarch/mm/cache.c                  |  3 ++
 arch/loongarch/vdso/Makefile               |  2 +-
 arch/loongarch/vdso/flush_icache.c         | 50 ++++++++++++++++++++++++++++++
 arch/loongarch/vdso/vdso.lds.S             |  5 +++
 scripts/syscall.tbl                        |  2 ++
 15 files changed, 140 insertions(+), 3 deletions(-)
---
base-commit: 8155b4ef3466f0e289e8fcc9e6e62f3f4dceeac2
change-id: 20250102-la32-uapi-8395e83a4e88

Best regards,
-- 
Jiaxun Yang <jiaxun.yang@flygoat.com>

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Arnd Bergmann 1 year, 1 month ago

On Thu, Jan 2, 2025, at 19:34, Jiaxun Yang wrote:

> Why am I upstreaming LoongArch32?
> ================================
> Although 32-bit systems are experiencing declining adoption in general
> computing, LoongArch32 remains highly relevant within specific niches.
> Beyond embedded applications, several vendors are actively developing
> application-level LoongArch32 processors. Loongson, for example, has
> released two open-source reference hardware implementations: openLA500
> and openLA1000 [6].
>
> The architecture also holds considerable educational value, having been
> integrated into China's national computer architecture curricula and
> embedded systems courses. Additionally, the National Student Computer
> System Capability Challenge (NSCSCC) [1] features LoongArch32 CPUs, where
> hundreds of students design Linux-capable hardware implementations and
> compete on performance. This initiative has resulted in several exciting
> high-performance LoongArch32 cores, including LainCore[2], Wired[3],
> NOP-Core[4], NagiCore[5]....

I'm surprised that so many resources get put into 32-bit hardware
implementations on loongarch, when this has mostly stopped on riscv
and arm, where new hardware is practically all either 64-bit Linux
or 32-bit NOMMU microcontrollers.

> From an upstream perspective, we will largely reuse the infrastructure
> already established for LoongArch64, ensuring that the maintenance burden
> remains minimal.
>
> Porting Status
> ==============
> The LoongArch32 port has been available downstream for some time, with
> various system components hosted on Loongson's Gitee[6]. However, these
> components utilise an older downstream ABI and fall short of upstream
> quality.
>
> On the upstream front, LLVM-19 now includes experimental support for
> LoongArch32 (ILP32 ABI) under the loongarch32* triple, and efforts are
> underway to enable GNU toolchain support. My upstream-ready kernel port
> and musl libc port can successfully boot into a minimal Buildroot
> environment and execute test cases on QEMU virt machine with clang
> toolchain.

I assume the MIPS legacy means that a 64-bit kernel is going to be
able to run the same ILP32 binaries as a 32-bit kernel running on
pure 32-bit hardware, similar to powerpc/s390/x86, but unlike
riscv/arm?

We need to be careful in defining the ABI to ensure that this covers
all the corner cases, such as defining a signal stack layout with
room to save 64-bit user register contents if there is a chance that
a 32-bit userspace will end up using the wide registers when
running on a 64-bit kernel, but also avoid any dependency on 64-bit
registers in the ABI itself.

    Arnd

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Xi Ruoyao 1 year, 1 month ago

On Sat, 2025-01-04 at 16:00 +0100, Arnd Bergmann wrote:
> On Thu, Jan 2, 2025, at 19:34, Jiaxun Yang wrote:
> 
> > Why am I upstreaming LoongArch32?
> > ================================
> > Although 32-bit systems are experiencing declining adoption in general
> > computing, LoongArch32 remains highly relevant within specific niches.
> > Beyond embedded applications, several vendors are actively developing
> > application-level LoongArch32 processors. Loongson, for example, has
> > released two open-source reference hardware implementations: openLA500
> > and openLA1000 [6].
> > 
> > The architecture also holds considerable educational value, having been
> > integrated into China's national computer architecture curricula and
> > embedded systems courses. Additionally, the National Student Computer
> > System Capability Challenge (NSCSCC) [1] features LoongArch32 CPUs, where
> > hundreds of students design Linux-capable hardware implementations and
> > compete on performance. This initiative has resulted in several exciting
> > high-performance LoongArch32 cores, including LainCore[2], Wired[3],
> > NOP-Core[4], NagiCore[5]....
> 
> I'm surprised that so many resources get put into 32-bit hardware
> implementations on loongarch, when this has mostly stopped on riscv
> and arm, where new hardware is practically all either 64-bit Linux
> or 32-bit NOMMU microcontrollers.
> 
> > From an upstream perspective, we will largely reuse the infrastructure
> > already established for LoongArch64, ensuring that the maintenance burden
> > remains minimal.
> > 
> > Porting Status
> > ==============
> > The LoongArch32 port has been available downstream for some time, with
> > various system components hosted on Loongson's Gitee[6]. However, these
> > components utilise an older downstream ABI and fall short of upstream
> > quality.
> > 
> > On the upstream front, LLVM-19 now includes experimental support for
> > LoongArch32 (ILP32 ABI) under the loongarch32* triple, and efforts are
> > underway to enable GNU toolchain support. My upstream-ready kernel port
> > and musl libc port can successfully boot into a minimal Buildroot
> > environment and execute test cases on QEMU virt machine with clang
> > toolchain.
> 
> I assume the MIPS legacy means that a 64-bit kernel is going to be
> able to run the same ILP32 binaries as a 32-bit kernel running on
> pure 32-bit hardware, similar to powerpc/s390/x86, but unlike
> riscv/arm?

LoongArch has instructions like addi.d/addi.w, instead of addi/addi.w,
so on a 32-bit implementation addi.d is simply missing, rather than addi
changing its semantics.  So I cannot see a real reason we cannot
support the same ILP32 userspace binaries compiled for 32-bit hardware
on 64-bit hardware.
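
To make the argument concrete, here is a minimal C model of addi.w on both
register widths. It is only a sketch: the helper names are hypothetical, the
immediate is assumed to be the usual 12-bit signed field, and the casts assume
two's complement conversion (as on every mainstream compiler). The low 32 bits
of the result are the same either way; the 64-bit core merely sign-extends.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 32-bit pattern to 64 bits (two's complement assumed). */
static inline uint64_t sext32(uint32_t v)
{
    return (uint64_t)(int64_t)(int32_t)v;
}

/* addi.w modelled on a GRLEN=32 core: a wrapping 32-bit add. */
static inline uint32_t addi_w_grlen32(uint32_t rj, int32_t si12)
{
    assert(si12 >= -2048 && si12 <= 2047); /* 12-bit signed immediate */
    return rj + (uint32_t)si12;
}

/* addi.w modelled on a GRLEN=64 core: the same 32-bit add on the low
 * half of rj, with the 32-bit result sign-extended into the register. */
static inline uint64_t addi_w_grlen64(uint64_t rj, int32_t si12)
{
    assert(si12 >= -2048 && si12 <= 2047);
    return sext32((uint32_t)rj + (uint32_t)si12);
}
```

Whatever sits in the upper half of rj on the 64-bit core is ignored, so an
ILP32 binary that only uses the .w forms observes the same low 32 bits on
either kind of hardware.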

> We need to be careful in defining the ABI to ensure that this covers
> all the corner cases, such as defining a signal stack layout with
> room to save 64-bit user register contents if there is a chance that
> a 32-bit userspace will end up using the wide registers when
> running on a 64-bit kernel, but also avoid any dependency on 64-bit
> registers in the ABI itself.

Yes such issues are nasty, we'd already need something in the calling
convention like "on 64-bit hardware, in ILP32 ABI the saved registers
may be unchanged or changed to the sign-extension from the lower 32 bits
of the original value."
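
As a rough illustration of that wording (not a proposal for the actual psABI
text), the rule can be written as a predicate over the register value before
and after a call. The function names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extension of the low 32 bits of a 64-bit register value. */
static inline uint64_t sext_low32(uint64_t v)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)v;
}

/* Under the quoted rule, a saved register after a call is acceptable if
 * it is either untouched or equal to the sign-extension of the low 32
 * bits of its original value. */
static inline int saved_reg_acceptable(uint64_t before, uint64_t after)
{
    return after == before || after == sext_low32(before);
}

/* A few cases spelled out (hypothetical values). */
static inline void saved_reg_examples(void)
{
    uint64_t wide = 0x1234567880000000ULL;

    assert(saved_reg_acceptable(wide, wide));                   /* unchanged */
    assert(saved_reg_acceptable(wide, 0xffffffff80000000ULL));  /* sign-extended */
    assert(!saved_reg_acceptable(wide, 0x0000000080000000ULL)); /* zero-extended */
}
```

The practical consequence is that a caller may rely only on the low 32 bits
of a saved register surviving a call.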

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Jiaxun Yang 1 year, 1 month ago


On Jan 4, 2025, at 3:13 PM, Xi Ruoyao wrote:
[...]
>> 
>> I assume the MIPS legacy means that a 64-bit kernel is going to be
>> able to run the same ILP32 binaries as a 32-bit kernel running on
>> pure 32-bit hardware, similar to powerpc/s390/x86, but unlike
>> riscv/arm?
>
> LoongArch has instructions like addi.d/addi.w, instead of addi/addi.w,
> so on a 32-bit implementation addi.d is simply missing, rather than addi
> changing its semantics.  So I cannot see a real reason we cannot
> support the same ILP32 userspace binaries compiled for 32-bit hardware
> on 64-bit hardware.

The only concern is that the behaviour of PC-relative instructions will change
in VA32 mode on LoongArch64 systems, i.e. the address will be sign-extended.
However, I think this serves the ILP32 purpose well.

>
>> We need to be careful in defining the ABI to ensure that this covers
>> all the corner cases, such as defining a signal stack layout with
>> room to save 64-bit user register contents if there is a chance that
>> a 32-bit userspace will end up using the wide registers when
>> running on a 64-bit kernel, but also avoid any dependency on 64-bit
>> registers in the ABI itself.
>
> Yes such issues are nasty, we'd already need something in the calling
> convention like "on 64-bit hardware, in ILP32 ABI the saved registers
> may be unchanged or changed to the sign-extension from the lower 32 bits
> of the original value."

Makes sense to me. For MIPS, the n32 (ILP32 on 64-bit) ABI has its own set of
UAPI definitions (and also mandates 64-bit GPRs), while the vanilla o32 ABI is
32-bit only and disallows any 64-bit instruction in user space.

When I was designing the current LA32 ABI I actually had the o32 ABI in mind.
However, LoongArch64 hardware is not capable of disabling 64-bit instructions
alone, so if we end up doing something like o32, the restriction to 32-bit
instructions needs to be enforced on the compiler side.

So I think the question is whether we want to allow 64-bit instructions in
LoongArch's ILP32 kernel UAPI. We can either follow MIPS's o32 path, making
the 32-bit ABI truly 32-bit, or maybe reuse the UAPI for ILP32 on 64-bit.

From Guo Ren's RISC-V compat work and arm64ilp32 I can certainly see
the benefit of ILP32 on 64-bit. Maybe we can bring that to LoongArch as well.

Thanks 

>
> -- 
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University

-- 
- Jiaxun

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Arnd Bergmann 1 year, 1 month ago

On Sat, Jan 4, 2025, at 17:03, Jiaxun Yang wrote:
> On Jan 4, 2025, at 3:13 PM, Xi Ruoyao wrote:
>>
>>> We need to be careful in defining the ABI to ensure that this covers
>>> all the corner cases, such as defining a signal stack layout with
>>> room to save 64-bit user register contents if there is a chance that
>>> a 32-bit userspace will end up using the wide registers when
>>> running on a 64-bit kernel, but also avoid any dependency on 64-bit
>>> registers in the ABI itself.
>>
>> Yes such issues are nasty, we'd already need something in the calling
>> convention like "on 64-bit hardware, in ILP32 ABI the saved registers
>> may be unchanged or changed to the sign-extension from the lower 32 bits
>> of the original value."
>
> Makes sense to me. For MIPS, the n32 (ILP32 on 64-bit) ABI has its own set of
> UAPI definitions (and also mandates 64-bit GPRs), while the vanilla o32 ABI is
> 32-bit only and disallows any 64-bit instruction in user space.
>
> When I was designing the current LA32 ABI I actually had the o32 ABI in mind.
> However, LoongArch64 hardware is not capable of disabling 64-bit instructions
> alone, so if we end up doing something like o32, the restriction to 32-bit
> instructions needs to be enforced on the compiler side.

> So I think the question is whether we want to allow 64-bit instructions in
> LoongArch's ILP32 kernel UAPI. We can either follow MIPS's o32 path, making
> the 32-bit ABI truly 32-bit, or maybe reuse the UAPI for ILP32 on 64-bit.

If at all possible, I think both the kernel's UAPI and the user side
ELF psABI should be defined as compatible with 32-bit hardware and
with userspace running on 64-bit kernels.

> From Guo Ren's RISC-V compat work and arm64ilp32 I can certainly see
> the benefit of ILP32 on 64-bit. Maybe we can bring that to LoongArch as well.

I would not take these as examples, since something went wrong for
each of them:

- RISC-V defined rv64 to not be a superset of rv32, so arithmetic
  instructions behave differently unless you switch modes
- aarch64 and aarch32 modes are completely different instruction sets,
  so aarch64ilp32 is by definition incompatible
- mips o32 as I understand it could work with 64-bit at the ISA level,
  as n32 does, but the ELF ABI does not allow using 64-bit registers,
  while n32 requires the use of 64-bit registers and does not work
  on 32-bit hardware.

If both the ISA and the ABI get it right, it should be possible to
build 32-bit userspace that is compatible with both when targeting
a 32-bit hardware, but still use 64-bit registers inside a single
function when the compiler is building for a 64-bit capable CPU
(e.g. "-march=la464 -m32"). There is a small cost in the calling
conventions for passing u64 arguments in pairs of registers
(unlike n32/x32/aarch64ilp32/rv64ilp32), but a huge benefit in
not maintaining two incompatible ABIs.
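
The register-pair cost mentioned above can be sketched in C as follows. This
is an illustration only: the struct and helper names are hypothetical, and
putting the low half in the lower-numbered register is an assumption here,
not something this thread specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Two consecutive 32-bit argument registers (hypothetical model). */
struct arg_pair {
    uint32_t lo; /* assumed: lower-numbered register, bits 31:0 */
    uint32_t hi; /* assumed: next register, bits 63:32 */
};

/* Caller side: split a u64 argument across the pair. */
static inline struct arg_pair split_u64(uint64_t v)
{
    struct arg_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

/* Callee side: rebuild the u64 from the pair. */
static inline uint64_t join_u64(struct arg_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}
```

A caller built for a 64-bit capable CPU pays one shift to split the value and
the callee pays a shift and an OR to rejoin it, whereas n32-style ABIs pass
the value in a single wide register.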

     Arnd

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Jiaxun Yang 1 year, 1 month ago


On Jan 5, 2025, at 4:43 AM, Arnd Bergmann wrote:
[...]
> If both the ISA and the ABI get it right, it should be possible to
> build 32-bit userspace that is compatible with both when targeting
> a 32-bit hardware, but still use 64-bit registers inside a single
> function when the compiler is building for a 64-bit capable CPU
> (e.g. "-march=la464 -m32"). There is a small cost in the calling
> conventions for passing u64 arguments in pairs of registers
> (unlike n32/x32/aarch64ilp32/rv64ilp32), but a huge benefit in
> not maintaining two incompatible ABIs.

Thanks Arnd for elaborating this!

I actually more or less had this in mind when I was designing this ABI,
which is why the GRs were designed to be 64-bit in sigcontext. But I never
looked into it closely.
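
For illustration, a width-independent layout along those lines could look
like the sketch below. It is hypothetical: the struct and field names are
invented for this example, it is not the struct from this series, and
storing 32-bit values sign-extended is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signal-context layout with 64-bit register slots, so the
 * same frame works for a 32-bit kernel and for a 64-bit kernel running
 * 32-bit userspace that may hold wide values in its registers. */
struct sigcontext_sketch {
    uint64_t pc;
    uint64_t regs[32]; /* room for full 64-bit GPR contents */
    uint32_t flags;
    uint32_t pad;      /* explicit padding: size is fixed on any target */
};

_Static_assert(offsetof(struct sigcontext_sketch, regs) == 8,
               "register slots start right after pc");
_Static_assert(sizeof(struct sigcontext_sketch) == 272,
               "layout must not depend on the native word size");

/* A GRLEN=32 kernel would fill a slot from a 32-bit register; the value
 * is stored sign-extended here (an assumption of this sketch). */
static inline void store_gpr32(struct sigcontext_sketch *sc,
                               unsigned int idx, uint32_t val)
{
    assert(idx < 32);
    sc->regs[idx] = (uint64_t)(int64_t)(int32_t)val;
}
```

Using only fixed-width types keeps the frame free of any dependency on
pointer or long size, which is the property the discussion above asks for.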

I'll try to explore that option, maybe come up with a COMPAT implementation
first.

Thanks
>
>      Arnd

-- 
- Jiaxun

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Jiaxun Yang 1 year, 1 month ago


On Jan 5, 2025, at 10:27 AM, Jiaxun Yang wrote:
> On Jan 5, 2025, at 4:43 AM, Arnd Bergmann wrote:
> [...]
>> If both the ISA and the ABI get it right, it should be possible to
>> build 32-bit userspace that is compatible with both when targeting
>> a 32-bit hardware, but still use 64-bit registers inside a single
>> function when the compiler is building for a 64-bit capable CPU
>> (e.g. "-march=la464 -m32"). There is a small cost in the calling
>> conventions for passing u64 arguments in pairs of registers
>> (unlike n32/x32/aarch64ilp32/rv64ilp32), but a huge benefit in
>> not maintaining two incompatible ABIs.
>

On closer inspection, I think there's an issue with having a unified ABI
for LA32 and LA64. I'll call them ILP32GRLEN32 and ILP32GRLEN64 below.

If we allow interlinking, we must treat the upper 32 bits of all GPRs as
caller-saved, since an ILP32GRLEN32 callee would be unaware of them. This
could incur significant performance overhead.
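
A small C model of the problem, as a sketch with hypothetical helper names:
a GRLEN32-style callee preserves a callee-saved register with st.w/ld.w,
which on 64-bit hardware keeps only the low half (ld.w sign-extends), so a
GRLEN64 caller has to spill and merge the upper half itself.

```c
#include <assert.h>
#include <stdint.h>

/* What a GRLEN32-style callee does to a "callee-saved" register when it
 * runs on 64-bit hardware: st.w stores bits 31:0, ld.w reloads them
 * sign-extended. The upper 32 bits do not survive. */
static inline uint64_t callee_save_restore_32(uint64_t reg)
{
    uint32_t slot = (uint32_t)reg;           /* st.w */
    return (uint64_t)(int64_t)(int32_t)slot; /* ld.w */
}

/* Caller side if the upper halves are treated as caller-saved: keep a
 * private copy of bits 63:32 and merge it back after the call. */
static inline uint64_t call_preserving_upper(uint64_t reg)
{
    uint32_t upper = (uint32_t)(reg >> 32);           /* extra spill */
    uint64_t after = callee_save_restore_32(reg);     /* the call */
    return ((uint64_t)upper << 32) | (uint32_t)after; /* extra merge */
}

/* The two outcomes side by side (hypothetical value). */
static inline void interlink_examples(void)
{
    uint64_t live = 0x1122334455667788ULL;

    assert(callee_save_restore_32(live) == 0x0000000055667788ULL);
    assert(call_preserving_upper(live) == live);
}
```

The extra spill and merge around every call, for every live wide value, is
where the overhead would come from.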

Alternatively, we can disallow interlinking. However, this effectively creates
a new, incompatible ABI. I'm not sure if it still fits our design goal.

Thanks

>>      Arnd
>
> -- 
> - Jiaxun

-- 
- Jiaxun

Re: [PATCH 0/3] LoongArch: initial 32-bit UAPI

Posted by Jinyang Shen 1 year, 1 month ago


On 2025/1/3 02:34, Jiaxun Yang wrote:
> This series defines the UAPI for LoongArch32, marking my initial step
> towards upstreaming support for the architecture. Once the UAPI is
> ratified, we can proceed to scrutinise various kernel components to
> enable 32-bit support while simultaneously addressing user-space porting.
> 
> Why am I upstreaming LoongArch32?
> ================================
> Although 32-bit systems are experiencing declining adoption in general
> computing, LoongArch32 remains highly relevant within specific niches.
> Beyond embedded applications, several vendors are actively developing
> application-level LoongArch32 processors. Loongson, for example, has
> released two open-source reference hardware implementations: openLA500
> and openLA1000 [6].
> 
> The architecture also holds considerable educational value, having been
> integrated into China's national computer architecture curricula and
> embedded systems courses. Additionally, the National Student Computer
> System Capability Challenge (NSCSCC) [1] features LoongArch32 CPUs, where
> hundreds of students design Linux-capable hardware implementations and
> compete on performance. This initiative has resulted in several exciting
> high-performance LoongArch32 cores, including LainCore[2], Wired[3],
> NOP-Core[4], NagiCore[5]....
> 
> From an upstream perspective, we will largely reuse the infrastructure
> already established for LoongArch64, ensuring that the maintenance burden
> remains minimal.
> 
> Porting Status
> ==============
> The LoongArch32 port has been available downstream for some time, with
> various system components hosted on Loongson's Gitee[6]. However, these
> components utilise an older downstream ABI and fall short of upstream
> quality.
> 
> On the upstream front, LLVM-19 now includes experimental support for
> LoongArch32 (ILP32 ABI) under the loongarch32* triple, and efforts are
> underway to enable GNU toolchain support. My upstream-ready kernel port
> and musl libc port can successfully boot into a minimal Buildroot
> environment and execute test cases on QEMU virt machine with clang
> toolchain.
> 
> Thank you for reading. I look forward to your comments and feedback.
> 
> [1]: https://www.tsinghua.edu.cn/en/info/1245/13802.htm
> [2]: https://github.com/LainChip/LainCore
> [3]: https://github.com/gmlayer0/wired
> [4]: https://github.com/NOP-Processor/NOP-Core
> [5]: https://github.com/MrAMS/NagiCore
> [6]: https://gitee.com/loongson-edu
> 
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> ---
> Jiaxun Yang (3):
>        loongarch: Wire up 32 bit syscalls
>        loongarch: Introduce sys_loongarch_flush_icache syscall
>        loongarch: vdso: Introduce __vdso_flush_icache function
> 
>   arch/loongarch/include/asm/Kbuild          |  1 +
>   arch/loongarch/include/asm/cacheflush.h    |  6 ++++
>   arch/loongarch/include/asm/syscall.h       |  2 ++
>   arch/loongarch/include/asm/vdso/vdso.h     | 10 ++++++
>   arch/loongarch/include/asm/vdso/vsyscall.h |  1 +
>   arch/loongarch/include/uapi/asm/Kbuild     |  1 +
>   arch/loongarch/include/uapi/asm/unistd.h   |  6 ++++
>   arch/loongarch/kernel/Makefile.syscalls    |  3 +-
>   arch/loongarch/kernel/syscall.c            | 49 +++++++++++++++++++++++++++++
>   arch/loongarch/kernel/vdso.c               |  2 ++
>   arch/loongarch/mm/cache.c                  |  3 ++
>   arch/loongarch/vdso/Makefile               |  2 +-
>   arch/loongarch/vdso/flush_icache.c         | 50 ++++++++++++++++++++++++++++++
>   arch/loongarch/vdso/vdso.lds.S             |  5 +++
>   scripts/syscall.tbl                        |  2 ++
>   15 files changed, 140 insertions(+), 3 deletions(-)
> ---
> base-commit: 8155b4ef3466f0e289e8fcc9e6e62f3f4dceeac2
> change-id: 20250102-la32-uapi-8395e83a4e88
> 
> Best regards,

Hi, Jiaxun,

Thank you for your hard work. I'm also working on the LoongArch32 kernel
side [1], and I hope we can get it upstream together.

[1]: https://github.com/shenjinyang/la32r-Linux

Jinyang