[PATCH v2 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension

Kuan-Wei Chiu posted 3 patches 6 months, 4 weeks ago
There is a newer version of this series
Posted by Kuan-Wei Chiu 6 months, 4 weeks ago
The binary GCD implementation uses FFS (find first set), which benefits
from hardware support for the ctz instruction, provided by the Zbb
extension on RISC-V. Without Zbb, this results in slower
software-emulated behavior.

Previously, RISC-V always used the binary GCD, regardless of actual
hardware support. This patch improves runtime efficiency by disabling
the efficient_ffs_key static branch when Zbb is either not enabled in
the kernel (config) or not supported on the executing CPU. This selects
the odd-even GCD implementation, which is faster in the absence of
efficient FFS.

This change ensures the most suitable GCD algorithm is chosen
dynamically based on actual hardware capabilities.

Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
 arch/riscv/kernel/setup.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index f7c9a1caa83e..f891eedc3644 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -21,6 +21,7 @@
 #include <linux/efi.h>
 #include <linux/crash_dump.h>
 #include <linux/panic_notifier.h>
+#include <linux/jump_label.h>
 
 #include <asm/acpi.h>
 #include <asm/alternative.h>
@@ -51,6 +52,8 @@ atomic_t hart_lottery __section(".sdata")
 ;
 unsigned long boot_cpu_hartid;
 
+DECLARE_STATIC_KEY_TRUE(efficient_ffs_key);
+
 /*
  * Place kernel memory regions on the resource tree so that
  * kexec-tools can retrieve them from /proc/iomem. While there
@@ -361,6 +364,9 @@ void __init setup_arch(char **cmdline_p)
 
 	riscv_user_isa_enable();
 	riscv_spinlock_init();
+
+	if (!IS_ENABLED(CONFIG_RISCV_ISA_ZBB) || !riscv_isa_extension_available(NULL, ZBB))
+		static_branch_disable(&efficient_ffs_key);
 }
 
 bool arch_cpu_is_hotpluggable(int cpu)
-- 
2.34.1
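
The trade-off described in the commit message can be illustrated with a small userspace sketch. This is not the kernel's lib/math/gcd.c code. The function names gcd_ctz(), gcd_shift() and gcd_select(), and the plain boolean efficient_ffs standing in for the efficient_ffs_key static branch, are hypothetical and used only for illustration. The sketch also assumes a GCC- or Clang-compatible compiler for __builtin_ctzl().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's efficient_ffs_key static branch. */
bool efficient_ffs = true;

/* Binary GCD: removes all trailing zero bits in one step via count-trailing-zeros. */
unsigned long gcd_ctz(unsigned long a, unsigned long b)
{
	unsigned long t;
	int shift;

	if (!a || !b)
		return a | b;

	shift = __builtin_ctzl(a | b);	/* power of two shared by a and b */
	a >>= __builtin_ctzl(a);	/* a is now odd */
	for (;;) {
		b >>= __builtin_ctzl(b);	/* b is now odd */
		if (a > b) {
			t = a;
			a = b;
			b = t;
		}
		b -= a;			/* odd - odd is even, and b >= a */
		if (!b)
			return a << shift;
	}
}

/* Same idea without ctz: trailing zero bits are shifted out one at a time. */
unsigned long gcd_shift(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;
	unsigned long t;

	if (!a || !b)
		return r;

	r &= -r;			/* lowest set bit of a | b */
	while (!(a & 1))
		a >>= 1;
	for (;;) {
		while (!(b & 1))
			b >>= 1;
		if (a > b) {
			t = a;
			a = b;
			b = t;
		}
		b -= a;
		if (!b)
			return a * r;	/* odd part times the shared power of two */
	}
}

/* Picks a variant at run time, much as the static branch would in the kernel. */
unsigned long gcd_select(unsigned long a, unsigned long b)
{
	return efficient_ffs ? gcd_ctz(a, b) : gcd_shift(a, b);
}
```

Both variants return the same result. The difference is only in how trailing zero bits are removed, which is why the choice matters when ctz has to be emulated in software.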
Re: [PATCH v2 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension
Posted by Andrew Morton 6 months, 2 weeks ago
On Sat, 24 May 2025 23:55:19 +0800 Kuan-Wei Chiu <visitorckw@gmail.com> wrote:

> The binary GCD implementation uses FFS (find first set), which benefits
> from hardware support for the ctz instruction, provided by the Zbb
> extension on RISC-V. Without Zbb, this results in slower
> software-emulated behavior.
> 
> Previously, RISC-V always used the binary GCD, regardless of actual
> hardware support. This patch improves runtime efficiency by disabling
> the efficient_ffs_key static branch when Zbb is either not enabled in
> the kernel (config) or not supported on the executing CPU. This selects
> the odd-even GCD implementation, which is faster in the absence of
> efficient FFS.
> 
> This change ensures the most suitable GCD algorithm is chosen
> dynamically based on actual hardware capabilities.
> 
> ...
>
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -21,6 +21,7 @@
>  #include <linux/efi.h>
>  #include <linux/crash_dump.h>
>  #include <linux/panic_notifier.h>
> +#include <linux/jump_label.h>
>  
>  #include <asm/acpi.h>
>  #include <asm/alternative.h>
> @@ -51,6 +52,8 @@ atomic_t hart_lottery __section(".sdata")
>  ;
>  unsigned long boot_cpu_hartid;
>  
> +DECLARE_STATIC_KEY_TRUE(efficient_ffs_key);

Please let's get this into a header file, visible to the definition
site and to all users.
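
One possible shape for this is sketched below. It assumes the key is defined with DEFINE_STATIC_KEY_TRUE() in lib/math/gcd.c by an earlier patch in the series (not shown in this message) and that include/linux/gcd.h is the header chosen to carry the declaration. Both locations are guesses made for illustration, not something this thread confirms.

```c
/* include/linux/gcd.h (assumed location for the shared declaration) */
#ifndef _GCD_H
#define _GCD_H

#include <linux/compiler.h>
#include <linux/jump_label.h>

/* Visible to the definition site and to every user of the key. */
DECLARE_STATIC_KEY_TRUE(efficient_ffs_key);

unsigned long gcd(unsigned long a, unsigned long b) __attribute_const__;

#endif /* _GCD_H */
```

With the declaration in a shared header, arch/riscv/kernel/setup.c would include that header rather than carry its own DECLARE_STATIC_KEY_TRUE() line, and the compiler can check the declaration against the definition.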