[v1] xen/bitops: hweight() cleanup/improvements

[PATCH 8/9] xen/bitops: Implement hweight32() in terms of hweightl()

Posted by Andrew Cooper 3 months ago

... and drop generic_hweight32().

As noted previously, the only two users of hweight32() and they're both
singleton callers in __init paths, so it's not interesting to have a sub-GPR
optimised generic.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Oleksii Kurochko <oleksii.kurochko@gmail.com>
CC: Shawn Anastasio <sanastasio@raptorengineering.com>
---
 xen/arch/arm/include/asm/bitops.h |  8 --------
 xen/arch/ppc/include/asm/bitops.h |  8 --------
 xen/arch/x86/include/asm/bitops.h |  8 --------
 xen/include/xen/bitops.h          | 24 +++++-------------------
 4 files changed, 5 insertions(+), 43 deletions(-)

diff --git a/xen/arch/arm/include/asm/bitops.h b/xen/arch/arm/include/asm/bitops.h
index bed6b3b98e08..f163d9bb4578 100644
--- a/xen/arch/arm/include/asm/bitops.h
+++ b/xen/arch/arm/include/asm/bitops.h
@@ -78,14 +78,6 @@ bool clear_mask16_timeout(uint16_t mask, volatile void *p,
 #define arch_fls(x)  ((x) ? 32 - __builtin_clz(x) : 0)
 #define arch_flsl(x) ((x) ? BITS_PER_LONG - __builtin_clzl(x) : 0)
 
-/**
- * hweightN - returns the hamming weight of a N-bit word
- * @x: the word to weigh
- *
- * The Hamming Weight of a number is the total number of bits set in it.
- */
-#define hweight32(x) generic_hweight32(x)
-
 #endif /* _ARM_BITOPS_H */
 /*
  * Local variables:
diff --git a/xen/arch/ppc/include/asm/bitops.h b/xen/arch/ppc/include/asm/bitops.h
index 24dc35ef644d..c942e9432e20 100644
--- a/xen/arch/ppc/include/asm/bitops.h
+++ b/xen/arch/ppc/include/asm/bitops.h
@@ -126,12 +126,4 @@ static inline int test_and_set_bit(unsigned int nr, volatile void *addr)
 
 #define arch_hweightl(x) __builtin_popcountl(x)
 
-/**
- * hweightN - returns the hamming weight of a N-bit word
- * @x: the word to weigh
- *
- * The Hamming Weight of a number is the total number of bits set in it.
- */
-#define hweight32(x) __builtin_popcount(x)
-
 #endif /* _ASM_PPC_BITOPS_H */
diff --git a/xen/arch/x86/include/asm/bitops.h b/xen/arch/x86/include/asm/bitops.h
index 9d3a2448036e..642d8e58b288 100644
--- a/xen/arch/x86/include/asm/bitops.h
+++ b/xen/arch/x86/include/asm/bitops.h
@@ -475,12 +475,4 @@ static always_inline unsigned int arch_flsl(unsigned long x)
 }
 #define arch_flsl arch_flsl
 
-/**
- * hweightN - returns the hamming weight of a N-bit word
- * @x: the word to weigh
- *
- * The Hamming Weight of a number is the total number of bits set in it.
- */
-#define hweight32(x) generic_hweight32(x)
-
 #endif /* _X86_BITOPS_H */
diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
index e97516552a2e..bad2601b0fe6 100644
--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -302,6 +302,11 @@ static always_inline __pure unsigned int hweightl(unsigned long x)
 #endif
 }
 
+static always_inline __pure unsigned int hweight32(uint32_t x)
+{
+    return hweightl(x);
+}
+
 static always_inline __pure unsigned int hweight64(uint64_t x)
 {
     if ( BITS_PER_LONG == 64 )
@@ -378,25 +383,6 @@ static inline int get_count_order(unsigned int count)
     return order;
 }
 
-/*
- * hweightN: returns the hamming weight (i.e. the number
- * of bits set) of a N-bit word
- */
-
-static inline unsigned int generic_hweight32(unsigned int w)
-{
-    w -= (w >> 1) & 0x55555555;
-    w =  (w & 0x33333333) + ((w >> 2) & 0x33333333);
-    w =  (w + (w >> 4)) & 0x0f0f0f0f;
-
-    if ( IS_ENABLED(CONFIG_HAS_FAST_MULTIPLY) )
-        return (w * 0x01010101) >> 24;
-
-    w += w >> 8;
-
-    return (w + (w >> 16)) & 0xff;
-}
-
 /*
  * rol32 - rotate a 32-bit value left
  *
-- 
2.39.2

Re: [PATCH 8/9] xen/bitops: Implement hweight32() in terms of hweightl()

Posted by Jan Beulich 2 months, 4 weeks ago

On 23.08.2024 01:06, Andrew Cooper wrote:
> ... and drop generic_hweight32().
> 
> As noted previously, the only two users of hweight32() and they're both
> singleton callers in __init paths, so it's not interesting to have a sub-GPR
> optimised generic.

I think it's clear what is meant, but the part of the sentence ahead of
the comma is a little bumpy. As to not interesting: Perhaps indeed not
right now, but new uses may appear and generally the overly wide
operations may be (slightly) more expensive. Of course we can deal with
the need when it arises, so ...

> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

Re: [PATCH 8/9] xen/bitops: Implement hweight32() in terms of hweightl()

Posted by Andrew Cooper 2 months, 4 weeks ago

On 26/08/2024 12:59 pm, Jan Beulich wrote:
> On 23.08.2024 01:06, Andrew Cooper wrote:
>> ... and drop generic_hweight32().
>>
>> As noted previously, the only two users of hweight32() and they're both
>> singleton callers in __init paths, so it's not interesting to have a sub-GPR
>> optimised generic.
> I think it's clear what is meant, but the part of the sentence ahead of
> the comma is a little bumpy. As to not interesting: Perhaps indeed not
> right now, but new uses may appear and generally the overly wide
> operations may be (slightly) more expensive. Of course we can deal with
> the need when it arises, so ...

Oh yes, that is wonky.  I'll rephrase.

>> No functional change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks.

~Andrew