Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Depends on "xen/riscv: make zbb as mandatory"
---
xen/arch/riscv/include/asm/bitops.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/xen/arch/riscv/include/asm/bitops.h b/xen/arch/riscv/include/asm/bitops.h
index d22eec1e87c7..df3df93520c5 100644
--- a/xen/arch/riscv/include/asm/bitops.h
+++ b/xen/arch/riscv/include/asm/bitops.h
@@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
#undef NOT
#undef __AMO
+#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
+#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
+#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
+#define arch_flsl(x) ((x) ? BITS_PER_LONG - __builtin_clzl(x) : 0)
+
+#define arch_heightl(x) __builtin_popcountl(x)
+
#endif /* ASM__RISCV__BITOPS_H */
/*
base-commit: 7cf163879c5add0a4f7f9c987b61f04f8f7051b1
prerequisite-patch-id: 9ee1f7ebf5d34b1c565ee2d3d4fb319164bb8bcd
prerequisite-patch-id: 8a05c87c8d051a3ac0820887f676bbd318e4ae88
prerequisite-patch-id: 6b56e42d130d8b5ee39457b6760b05cc6e16b049
prerequisite-patch-id: c139f1f5741d695cd5e5aa6be904edcb61b73885
prerequisite-patch-id: 99f8b701000e9ee11060934e627a988ddf9aaaa7
--
2.39.5
On 26.02.2025 18:20, Andrew Cooper wrote:
> --- a/xen/arch/riscv/include/asm/bitops.h
> +++ b/xen/arch/riscv/include/asm/bitops.h
> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
> #undef NOT
> #undef __AMO
>
> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
I fear you won't like me to say this, but can't we avoid baking in yet
another assumption on sizeof(int) == 4, by using at least sizeof(int) * 8
here (yet better might be sizeof(int) * BITS_PER_BYTE)?
> +#define arch_flsl(x) ((x) ? BITS_PER_LONG - __builtin_clzl(x) : 0)
> +
> +#define arch_heightl(x) __builtin_popcountl(x)
arch_hweightl()?

Jan
On 27/02/2025 8:11 am, Jan Beulich wrote:
> On 26.02.2025 18:20, Andrew Cooper wrote:
>> --- a/xen/arch/riscv/include/asm/bitops.h
>> +++ b/xen/arch/riscv/include/asm/bitops.h
>> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>> #undef NOT
>> #undef __AMO
>>
>> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
>> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
>> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
> I fear you won't like me to say this, but can't we avoid baking in yet
> another assumption on sizeof(int) == 4, by using at least sizeof(int) * 8
> here (yet better might be sizeof(int) * BITS_PER_BYTE)?

Yes and no.

No, because 32 here is consistent with ARM and PPC when it comes to
arch_fls().  Given the effort it took to get these consistent, I'm not
interested in letting them diverge.

But, if someone wants to introduce BITS_PER_INT to mirror BITS_PER_LONG
and use it consistently, then that would be ok too.

>
>> +#define arch_flsl(x) ((x) ? BITS_PER_LONG - __builtin_clzl(x) : 0)
>> +
>> +#define arch_heightl(x) __builtin_popcountl(x)
> arch_hweightl()?

Oops yes.  And RISC-V does have two uses, via __bitmap_weight.

~Andrew
On 28.02.2025 17:24, Andrew Cooper wrote:
> On 27/02/2025 8:11 am, Jan Beulich wrote:
>> On 26.02.2025 18:20, Andrew Cooper wrote:
>>> --- a/xen/arch/riscv/include/asm/bitops.h
>>> +++ b/xen/arch/riscv/include/asm/bitops.h
>>> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>>> #undef NOT
>>> #undef __AMO
>>>
>>> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
>>> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
>>> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
>> I fear you won't like me to say this, but can't we avoid baking in yet
>> another assumption on sizeof(int) == 4, by using at least sizeof(int) * 8
>> here (yet better might be sizeof(int) * BITS_PER_BYTE)?
>
> Yes and no.
>
> No, because 32 here is consistent with ARM and PPC when it comes to
> arch_fls(). Given the effort it took to get these consistent, I'm not
> interested in letting them diverge.
>
> But, if someone wants to introduce BITS_PER_INT to mirror BITS_PER_LONG
> and use it consistently, then that would be ok too.
I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
retain a shorthand of that name, if so desired, but I see no reason why
each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
Jan
On 05/03/2025 7:34 am, Jan Beulich wrote:
> On 28.02.2025 17:24, Andrew Cooper wrote:
>> On 27/02/2025 8:11 am, Jan Beulich wrote:
>>> On 26.02.2025 18:20, Andrew Cooper wrote:
>>>> --- a/xen/arch/riscv/include/asm/bitops.h
>>>> +++ b/xen/arch/riscv/include/asm/bitops.h
>>>> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>>>> #undef NOT
>>>> #undef __AMO
>>>>
>>>> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
>>>> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
>>>> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
>>> I fear you won't like me to say this, but can't we avoid baking in yet
>>> another assumption on sizeof(int) == 4, by using at least sizeof(int) * 8
>>> here (yet better might be sizeof(int) * BITS_PER_BYTE)?
>> Yes and no.
>>
>> No, because 32 here is consistent with ARM and PPC when it comes to
>> arch_fls(). Given the effort it took to get these consistent, I'm not
>> interested in letting them diverge.
>>
>> But, if someone wants to introduce BITS_PER_INT to mirror BITS_PER_LONG
>> and use it consistently, then that would be ok too.
Oleksii: I see your patch is committed, but when I said "use it
consistently", I meant "patch ARM and PPC too".
> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
> retain a shorthand of that name, if so desired, but I see no reason why
> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
The concern is legibility and clarity.
This:
((x) ? 32 - __builtin_clz(x) : 0)
is a clear expression in a way that this:
((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
is not. The problem is the extra binary expression, and this:
((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
is still clear, because the reader doesn't have to perform a multiply
just to figure out what's going on.
It is definitely stupid to have each architecture provide their own
BITS_PER_*. The compiler is in a superior position to provide those
details, and it should be in a common location.
I don't particularly mind how those constants are derived, but one key
thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
The following files use BITS_PER_LONG preprocessing:
arch/arm/include/asm/div64.h
arch/x86/cpu/mcheck/mce.c
arch/x86/smpboot.c
common/bitops.c
common/coverage/gcov.h
common/coverage/llvm.c
common/cpu.c
common/event_channel.c
common/time.c
common/ubsan/ubsan.c
include/asm-generic/div64.h
include/xen/cpumask.h
include/xen/inttypes.h
include/xen/nodemask.h
include/xen/sched.h
include/xen/xxhash.h
lib/find-next-bit.c
lib/generic-ffsl.c
lib/generic-flsl.c
lib/generic-hweightl.c
And I really don't think they can be replaced with a sizeof().
~Andrew
On 3/6/25 9:19 PM, Andrew Cooper wrote:
> On 05/03/2025 7:34 am, Jan Beulich wrote:
>> On 28.02.2025 17:24, Andrew Cooper wrote:
>>> On 27/02/2025 8:11 am, Jan Beulich wrote:
>>>> On 26.02.2025 18:20, Andrew Cooper wrote:
>>>>> --- a/xen/arch/riscv/include/asm/bitops.h
>>>>> +++ b/xen/arch/riscv/include/asm/bitops.h
>>>>> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>>>>> #undef NOT
>>>>> #undef __AMO
>>>>>
>>>>> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
>>>>> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
>>>>> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
>>>> I fear you won't like me to say this, but can't we avoid baking in yet
>>>> another assumption on sizeof(int) == 4, by using at least sizeof(int) * 8
>>>> here (yet better might be sizeof(int) * BITS_PER_BYTE)?
>>> Yes and no.
>>>
>>> No, because 32 here is consistent with ARM and PPC when it comes to
>>> arch_fls(). Given the effort it took to get these consistent, I'm not
>>> interested in letting them diverge.
>>>
>>> But, if someone wants to introduce BITS_PER_INT to mirror BITS_PER_LONG
>>> and use it consistently, then that would be ok too.
> Oleksii: I see your patch is committed, but when I said "use it
> consistently", I meant "patch ARM and PPC too".
I planned to do that in a separate patch next week.
>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>> retain a shorthand of that name, if so desired, but I see no reason why
>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
> The concern is legibility and clarity.
>
> This:
>
> ((x) ? 32 - __builtin_clz(x) : 0)
>
> is a clear expression in a way that this:
>
> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>
> is not. The problem is the extra binary expression, and this:
>
> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>
> is still clear, because the reader doesn't have to perform a multiply to
> just to figure out what's going on.
>
>
> It is definitely stupid to have each architecture provide their own
> BITS_PER_*. The compiler is in a superior position to provide those
> details, and it should be in a common location.
>
> I don't particularly mind how those constants are derived, but one key
> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
#define BYTES_PER_INT (1 << INT_BYTEORDER)
#define BITS_PER_INT (BYTES_PER_INT << 3)
#define BYTES_PER_LONG (1 << LONG_BYTEORDER)
#define BITS_PER_LONG (BYTES_PER_LONG << 3)
#define BITS_PER_BYTE 8
Also, it seems like the following could be moved there too:
#define POINTER_ALIGN BYTES_PER_LONG
#define BITS_PER_LLONG 64
#define BITS_PER_BYTE 8
However, BITS_PER_XEN_ULONG differs for x86, so it should stay in the arch-specific config.h.
~ Oleksii
>
> The following files use BITS_PER_LONG preprocessing:
>
> arch/arm/include/asm/div64.h
> arch/x86/cpu/mcheck/mce.c
> arch/x86/smpboot.c
> common/bitops.c
> common/coverage/gcov.h
> common/coverage/llvm.c
> common/cpu.c
> common/event_channel.c
> common/time.c
> common/ubsan/ubsan.c
> include/asm-generic/div64.h
> include/xen/cpumask.h
> include/xen/inttypes.h
> include/xen/nodemask.h
> include/xen/sched.h
> include/xen/xxhash.h
> lib/find-next-bit.c
> lib/generic-ffsl.c
> lib/generic-flsl.c
> lib/generic-hweightl.c
>
> And I really don't think they can be replaced with a sizeof().
>
> ~Andrew
On 07.03.2025 12:50, Oleksii Kurochko wrote:
> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>> retain a shorthand of that name, if so desired, but I see no reason why
>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>> The concern is legibility and clarity.
>>
>> This:
>>
>> ((x) ? 32 - __builtin_clz(x) : 0)
>>
>> is a clear expression in a way that this:
>>
>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>
>> is not. The problem is the extra binary expression, and this:
>>
>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>
>> is still clear, because the reader doesn't have to perform a multiply to
>> just to figure out what's going on.
>>
>>
>> It is definitely stupid to have each architecture provide their own
>> BITS_PER_*. The compiler is in a superior position to provide those
>> details, and it should be in a common location.
>>
>> I don't particularly mind how those constants are derived, but one key
>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>
> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>
> #define BYTES_PER_INT (1 << INT_BYTEORDER)
> #define BITS_PER_INT (BYTES_PER_INT << 3)
>
> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
> #define BITS_PER_BYTE 8
>
> Also, it seems like the follwoing could be moved there too:
>
> #define POINTER_ALIGN BYTES_PER_LONG
This one is likely fine to move.
> #define BITS_PER_LLONG 64
This one is only fine to move imo when converted to
#define BITS_PER_LONG (BYTES_PER_LLONG << 3)
> #define BITS_PER_BYTE 8
Personally I'd rather leave this per-arch. The others can truly be derived;
this one can't be. If we centralize, imo we should also convert the " << 3"
to " * BITS_PER_BYTE".
Jan
On 07/03/2025 12:01 pm, Jan Beulich wrote:
> On 07.03.2025 12:50, Oleksii Kurochko wrote:
>> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>>> retain a shorthand of that name, if so desired, but I see no reason why
>>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>>> The concern is legibility and clarity.
>>>
>>> This:
>>>
>>> ((x) ? 32 - __builtin_clz(x) : 0)
>>>
>>> is a clear expression in a way that this:
>>>
>>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>>
>>> is not. The problem is the extra binary expression, and this:
>>>
>>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>>
>>> is still clear, because the reader doesn't have to perform a multiply to
>>> just to figure out what's going on.
>>>
>>>
>>> It is definitely stupid to have each architecture provide their own
>>> BITS_PER_*. The compiler is in a superior position to provide those
>>> details, and it should be in a common location.
>>>
>>> I don't particularly mind how those constants are derived, but one key
>>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>>
>> #define BYTES_PER_INT (1 << INT_BYTEORDER)
>> #define BITS_PER_INT (BYTES_PER_INT << 3)
>>
>> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>> #define BITS_PER_BYTE 8
The *_BYTEORDER's are useless indirection. BITS_PER_* should be defined
straight up, and this will simplify quite a lot of preprocessing.
>>
>> Also, it seems like the follwoing could be moved there too:
>>
>> #define POINTER_ALIGN BYTES_PER_LONG
> This one is likely fine to move.
>
>> #define BITS_PER_LLONG 64
> This one is only fine to move imo when converted to
>
> #define BITS_PER_LONG (BYTES_PER_LLONG << 3)
Erm? That's mixing long and long long types. Presuming that's an
errant L, then sure.
>
>> #define BITS_PER_BYTE 8
> Personally I'd rather leave this per-arch. The others can truly be derived;
> this one can't be. If we centralize, imo we should also convert the " << 3"
> to " * BITS_PER_BYTE".
It is highly unlikely that Xen will ever run on a system where CHAR_BIT
isn't 8.
So I suggest it stays central to reduce complexity. If an arch ever
needs to change it, the complexity can be added then.
~Andrew
On 3/7/25 1:12 PM, Andrew Cooper wrote:
> On 07/03/2025 12:01 pm, Jan Beulich wrote:
>> On 07.03.2025 12:50, Oleksii Kurochko wrote:
>>> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>>>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>>>> retain a shorthand of that name, if so desired, but I see no reason why
>>>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>>>> The concern is legibility and clarity.
>>>>
>>>> This:
>>>>
>>>> ((x) ? 32 - __builtin_clz(x) : 0)
>>>>
>>>> is a clear expression in a way that this:
>>>>
>>>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>>>
>>>> is not. The problem is the extra binary expression, and this:
>>>>
>>>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>>>
>>>> is still clear, because the reader doesn't have to perform a multiply to
>>>> just to figure out what's going on.
>>>>
>>>>
>>>> It is definitely stupid to have each architecture provide their own
>>>> BITS_PER_*. The compiler is in a superior position to provide those
>>>> details, and it should be in a common location.
>>>>
>>>> I don't particularly mind how those constants are derived, but one key
>>>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>>> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>>>
>>> #define BYTES_PER_INT (1 << INT_BYTEORDER)
>>> #define BITS_PER_INT (BYTES_PER_INT << 3)
>>>
>>> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>>> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>>> #define BITS_PER_BYTE 8
> The *_BYTEORDER's are useless indirection. BITS_PER_* should be defined
> straight up, and this will simplify quite a lot of preprocessing.
Could we really drop *_BYTEORDER?
For example, LONG_BYTEORDER for Arm could be either 2 or 3, depending on whether Arm32 or Arm64 is used.
>
>>> Also, it seems like the follwoing could be moved there too:
>>>
>>> #define POINTER_ALIGN BYTES_PER_LONG
>> This one is likely fine to move.
>>
>>> #define BITS_PER_LLONG 64
>> This one is only fine to move imo when converted to
>>
>> #define BITS_PER_LONG (BYTES_PER_LLONG << 3)
> Erm? That's mixing long and long long types. Presuming that's an
> errant L, then sure.
>
>>> #define BITS_PER_BYTE 8
>> Personally I'd rather leave this per-arch. The others can truly be derived;
>> this one can't be. If we centralize, imo we should also convert the " << 3"
>> to " * BITS_PER_BYTE".
> It is highly unlikely that Xen will ever run on a system where CHAR_BIT
> isn't 8.
>
> So I suggest it stays central to reduce complexity. If an arch ever
> needs to change it, the complexity can be added then.
Does it make sense to ifdef that? Or, at least, before the definition of BITS_PER_BYTE something like:
#if CHAR_BIT != 8
#error "CHAR_BIT isn't 8"
#endif
~ Oleksii
On 25.03.2025 17:35, Oleksii Kurochko wrote:
>
> On 3/7/25 1:12 PM, Andrew Cooper wrote:
>> On 07/03/2025 12:01 pm, Jan Beulich wrote:
>>> On 07.03.2025 12:50, Oleksii Kurochko wrote:
>>>> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>>>>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>>>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>>>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>>>>> retain a shorthand of that name, if so desired, but I see no reason why
>>>>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>>>>> The concern is legibility and clarity.
>>>>>
>>>>> This:
>>>>>
>>>>> ((x) ? 32 - __builtin_clz(x) : 0)
>>>>>
>>>>> is a clear expression in a way that this:
>>>>>
>>>>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>>>>
>>>>> is not. The problem is the extra binary expression, and this:
>>>>>
>>>>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>>>>
>>>>> is still clear, because the reader doesn't have to perform a multiply to
>>>>> just to figure out what's going on.
>>>>>
>>>>>
>>>>> It is definitely stupid to have each architecture provide their own
>>>>> BITS_PER_*. The compiler is in a superior position to provide those
>>>>> details, and it should be in a common location.
>>>>>
>>>>> I don't particularly mind how those constants are derived, but one key
>>>>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>>>> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>>>>
>>>> #define BYTES_PER_INT (1 << INT_BYTEORDER)
>>>> #define BITS_PER_INT (BYTES_PER_INT << 3)
>>>>
>>>> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>>>> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>>>> #define BITS_PER_BYTE 8
>> The *_BYTEORDER's are useless indirection. BITS_PER_* should be defined
>> straight up, and this will simplify quite a lot of preprocessing.
>
> Could we really drop *_BYTEORDER?
>
> For example, LONG_BYTEORDER for Arm could be either 2 or 3 depends on Arm32 or Arm64 is used.
They can still #define BITS_PER_LONG to 32 or 64 respectively, can't they?
>>>> Also, it seems like the follwoing could be moved there too:
>>>>
>>>> #define POINTER_ALIGN BYTES_PER_LONG
>>> This one is likely fine to move.
>>>
>>>> #define BITS_PER_LLONG 64
>>> This one is only fine to move imo when converted to
>>>
>>> #define BITS_PER_LONG (BYTES_PER_LLONG << 3)
>> Erm? That's mixing long and long long types. Presuming that's an
>> errant L, then sure.
>>
>>>> #define BITS_PER_BYTE 8
>>> Personally I'd rather leave this per-arch. The others can truly be derived;
>>> this one can't be. If we centralize, imo we should also convert the " << 3"
>>> to " * BITS_PER_BYTE".
>> It is highly unlikely that Xen will ever run on a system where CHAR_BIT
>> isn't 8.
>>
>> So I suggest it stays central to reduce complexity. If an arch ever
>> needs to change it, the complexity can be added then.
>
> Does it make sense to ifdef that? Or, at least, before defintion of BITS_PER_BYTE something like:
> #if CHAR_BIT != 8
> #error "CHAR_BIT isn't 8"
> #endif
Where would CHAR_BIT come from? Oh, perhaps you mean __CHAR_BIT__? If all
compilers we can build with supply that value, we could indeed centrally
use either
#define BITS_PER_BYTE __CHAR_BIT__
or
#define BITS_PER_BYTE 8
#if BITS_PER_BYTE != __CHAR_BIT__
# error "..."
#endif
Jan
On 3/25/25 5:46 PM, Jan Beulich wrote:
> On 25.03.2025 17:35, Oleksii Kurochko wrote:
>> On 3/7/25 1:12 PM, Andrew Cooper wrote:
>>> On 07/03/2025 12:01 pm, Jan Beulich wrote:
>>>> On 07.03.2025 12:50, Oleksii Kurochko wrote:
>>>>> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>>>>>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>>>>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>>>>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>>>>>> retain a shorthand of that name, if so desired, but I see no reason why
>>>>>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>>>>>> The concern is legibility and clarity.
>>>>>>
>>>>>> This:
>>>>>>
>>>>>> ((x) ? 32 - __builtin_clz(x) : 0)
>>>>>>
>>>>>> is a clear expression in a way that this:
>>>>>>
>>>>>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>>>>>
>>>>>> is not. The problem is the extra binary expression, and this:
>>>>>>
>>>>>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>>>>>
>>>>>> is still clear, because the reader doesn't have to perform a multiply to
>>>>>> just to figure out what's going on.
>>>>>>
>>>>>>
>>>>>> It is definitely stupid to have each architecture provide their own
>>>>>> BITS_PER_*. The compiler is in a superior position to provide those
>>>>>> details, and it should be in a common location.
>>>>>>
>>>>>> I don't particularly mind how those constants are derived, but one key
>>>>>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>>>>> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>>>>>
>>>>> #define BYTES_PER_INT (1 << INT_BYTEORDER)
>>>>> #define BITS_PER_INT (BYTES_PER_INT << 3)
>>>>>
>>>>> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>>>>> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>>>>> #define BITS_PER_BYTE 8
>>> The *_BYTEORDER's are useless indirection. BITS_PER_* should be defined
>>> straight up, and this will simplify quite a lot of preprocessing.
>> Could we really drop *_BYTEORDER?
>>
>> For example, LONG_BYTEORDER for Arm could be either 2 or 3 depends on Arm32 or Arm64 is used.
> The can still #define BITS_PER_LONG to 32 or 64 respectively, can't they?
Yes, but then if we want to move it to xen/config.h, Arm's asm/config.h would need something like:
#ifdef CONFIG_ARM_32
#define BITS_PER_LONG 32
#endif
and then in xen/config.h
#ifndef BITS_PER_LONG
#define BITS_PER_LONG 64
#endif
But I wanted to avoid having #ifndef BITS_PER_LONG in xen/config.h (or using CONFIG_ARM_32 in xen/config.h).
If this #ifndef is acceptable in xen/config.h, then we can do it that way.
>
>>>>> Also, it seems like the follwoing could be moved there too:
>>>>>
>>>>> #define POINTER_ALIGN BYTES_PER_LONG
>>>> This one is likely fine to move.
>>>>
>>>>> #define BITS_PER_LLONG 64
>>>> This one is only fine to move imo when converted to
>>>>
>>>> #define BITS_PER_LONG (BYTES_PER_LLONG << 3)
>>> Erm? That's mixing long and long long types. Presuming that's an
>>> errant L, then sure.
>>>
>>>>> #define BITS_PER_BYTE 8
>>>> Personally I'd rather leave this per-arch. The others can truly be derived;
>>>> this one can't be. If we centralize, imo we should also convert the " << 3"
>>>> to " * BITS_PER_BYTE".
>>> It is highly unlikely that Xen will ever run on a system where CHAR_BIT
>>> isn't 8.
>>>
>>> So I suggest it stays central to reduce complexity. If an arch ever
>>> needs to change it, the complexity can be added then.
>> Does it make sense to ifdef that? Or, at least, before defintion of BITS_PER_BYTE something like:
>> #if CHAR_BIT != 8
>> #error "CHAR_BIT isn't 8"
>> #endif
> Where would CHAR_BIT come from? Oh, perhaps you mean __CHAR_BIT__? If all
> compilers we can build with supply that value, we could indeed centrally
> use either
>
> #define BITS_PER_BYTE __CHAR_BIT__
>
> or
>
> #define BITS_PER_BYTE 8
> #if BITS_PER_BYTE != __CHAR_BIT__
> # error "..."
> #endif
Sorry, I meant __CHAR_BIT__.
Thanks.
~ Oleksii
On 25.03.2025 18:04, Oleksii Kurochko wrote:
>
> On 3/25/25 5:46 PM, Jan Beulich wrote:
>> On 25.03.2025 17:35, Oleksii Kurochko wrote:
>>> On 3/7/25 1:12 PM, Andrew Cooper wrote:
>>>> On 07/03/2025 12:01 pm, Jan Beulich wrote:
>>>>> On 07.03.2025 12:50, Oleksii Kurochko wrote:
>>>>>> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>>>>>>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>>>>>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>>>>>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>>>>>>> retain a shorthand of that name, if so desired, but I see no reason why
>>>>>>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>>>>>>> The concern is legibility and clarity.
>>>>>>>
>>>>>>> This:
>>>>>>>
>>>>>>> ((x) ? 32 - __builtin_clz(x) : 0)
>>>>>>>
>>>>>>> is a clear expression in a way that this:
>>>>>>>
>>>>>>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>>>>>>
>>>>>>> is not. The problem is the extra binary expression, and this:
>>>>>>>
>>>>>>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>>>>>>
>>>>>>> is still clear, because the reader doesn't have to perform a multiply to
>>>>>>> just to figure out what's going on.
>>>>>>>
>>>>>>>
>>>>>>> It is definitely stupid to have each architecture provide their own
>>>>>>> BITS_PER_*. The compiler is in a superior position to provide those
>>>>>>> details, and it should be in a common location.
>>>>>>>
>>>>>>> I don't particularly mind how those constants are derived, but one key
>>>>>>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>>>>>> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>>>>>>
>>>>>> #define BYTES_PER_INT (1 << INT_BYTEORDER)
>>>>>> #define BITS_PER_INT (BYTES_PER_INT << 3)
>>>>>>
>>>>>> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>>>>>> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>>>>>> #define BITS_PER_BYTE 8
>>>> The *_BYTEORDER's are useless indirection. BITS_PER_* should be defined
>>>> straight up, and this will simplify quite a lot of preprocessing.
>>> Could we really drop *_BYTEORDER?
>>>
>>> For example, LONG_BYTEORDER for Arm could be either 2 or 3 depends on Arm32 or Arm64 is used.
>> The can still #define BITS_PER_LONG to 32 or 64 respectively, can't they?
>
> Yes, but then if we want to move it to xen/config.h then it will be needed to:
> in Arm's asm/config.h to have something like:
> #ifdef CONFIG_ARM_32
> #define BITS_PER_LONG 32
> #endif
>
> and then in xen/config.h
> #ifndef BITS_PER_LONG
> #define BITS_PER_LONG 64
> #endif
>
> But I wanted to not have #ifndef BITS_PER_LONG in xen/config.h. (or using CONFIG_ARM_32 in xen/config.h)
Whatever the fundamental definitions that can vary per arch - those should
imo live in asm/config.h. For the case here, if we deem *_BYTEORDER as
fundamental, they go there (and BITS_PER_* go into xen/config.h). If we deem
BITS_PER_* fundamental, they go into asm/config.h.
Stuff that we expect to remain uniform across ports (BITS_PER_BYTE, possibly
also BITS_PER_INT BITS_PER_LLONG) may also go into xen/config.h, as long as
we're okay with the possible future churn if a port appeared that has
different needs.
I agree that there better wouldn't be #ifndef for such in xen/config.h.
Jan
On 26/03/2025 10:15 am, Jan Beulich wrote:
> On 25.03.2025 18:04, Oleksii Kurochko wrote:
>> On 3/25/25 5:46 PM, Jan Beulich wrote:
>>> On 25.03.2025 17:35, Oleksii Kurochko wrote:
>>>> On 3/7/25 1:12 PM, Andrew Cooper wrote:
>>>>> On 07/03/2025 12:01 pm, Jan Beulich wrote:
>>>>>> On 07.03.2025 12:50, Oleksii Kurochko wrote:
>>>>>>> On 3/6/25 9:19 PM, Andrew Cooper wrote:
>>>>>>>> On 05/03/2025 7:34 am, Jan Beulich wrote:
>>>>>>>>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>>>>>>>>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>>>>>>>>> retain a shorthand of that name, if so desired, but I see no reason why
>>>>>>>>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>>>>>>>> The concern is legibility and clarity.
>>>>>>>>
>>>>>>>> This:
>>>>>>>>
>>>>>>>> ((x) ? 32 - __builtin_clz(x) : 0)
>>>>>>>>
>>>>>>>> is a clear expression in a way that this:
>>>>>>>>
>>>>>>>> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>>>>>>>>
>>>>>>>> is not. The problem is the extra binary expression, and this:
>>>>>>>>
>>>>>>>> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>>>>>>>>
>>>>>>>> is still clear, because the reader doesn't have to perform a multiply to
>>>>>>>> just to figure out what's going on.
>>>>>>>>
>>>>>>>>
>>>>>>>> It is definitely stupid to have each architecture provide their own
>>>>>>>> BITS_PER_*. The compiler is in a superior position to provide those
>>>>>>>> details, and it should be in a common location.
>>>>>>>>
>>>>>>>> I don't particularly mind how those constants are derived, but one key
>>>>>>>> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
>>>>>>> What about moving them to xen/config.h? (if it isn't the best one place, any suggestion which is better?)
>>>>>>>
>>>>>>> #define BYTES_PER_INT (1 << INT_BYTEORDER)
>>>>>>> #define BITS_PER_INT (BYTES_PER_INT << 3)
>>>>>>>
>>>>>>> #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
>>>>>>> #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>>>>>>> #define BITS_PER_BYTE 8
>>>>> The *_BYTEORDER's are useless indirection. BITS_PER_* should be defined
>>>>> straight up, and this will simplify quite a lot of preprocessing.
>>>> Could we really drop *_BYTEORDER?
>>>>
>>>> For example, LONG_BYTEORDER for Arm could be either 2 or 3 depends on Arm32 or Arm64 is used.
>>> The can still #define BITS_PER_LONG to 32 or 64 respectively, can't they?
>> Yes, but then if we want to move it to xen/config.h then it will be needed to:
>> in Arm's asm/config.h to have something like:
>> #ifdef CONFIG_ARM_32
>> #define BITS_PER_LONG 32
>> #endif
>>
>> and then in xen/config.h
>> #ifndef BITS_PER_LONG
>> #define BITS_PER_LONG 64
>> #endif
>>
>> But I wanted to not have #ifndef BITS_PER_LONG in xen/config.h. (or using CONFIG_ARM_32 in xen/config.h)
> Whatever the fundamental definitions that can vary per arch - those should
> imo live in asm/config.h. For the case here, if we deem *_BYTEORDER as
> fundamental, they go there (and BITS_PER_* go into xen/config.h). If we deem
> BITS_PER_* fundamental, they go into asm/config.h.
>
> Stuff that we expect to remain uniform across ports (BITS_PER_BYTE, possibly
> also BITS_PER_INT, BITS_PER_LLONG) may also go into xen/config.h, as long as
> we're okay with the possible future churn if a port appeared that has
> different needs.
>
> I agree that there better wouldn't be #ifndef for such in xen/config.h.
With a new toolchain baseline, we get both of these concepts directly
from the compiler environment.
I don't see any need for arch-specific overrides to these. They
ultimately come from the -march used.
~Andrew
On 06.03.2025 21:19, Andrew Cooper wrote:
> On 05/03/2025 7:34 am, Jan Beulich wrote:
>> On 28.02.2025 17:24, Andrew Cooper wrote:
>>> On 27/02/2025 8:11 am, Jan Beulich wrote:
>>>> On 26.02.2025 18:20, Andrew Cooper wrote:
>>>>> --- a/xen/arch/riscv/include/asm/bitops.h
>>>>> +++ b/xen/arch/riscv/include/asm/bitops.h
>>>>> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>>>>> #undef NOT
>>>>> #undef __AMO
>>>>>
>>>>> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
>>>>> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
>>>>> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
>>>> I fear you won't like me to say this, but can't we avoid baking in yet
>>>> another assumption on sizeof(int) == 4, by using at least sizeof(int) * 8
>>>> here (yet better might be sizeof(int) * BITS_PER_BYTE)?
>>> Yes and no.
>>>
>>> No, because 32 here is consistent with ARM and PPC when it comes to
>>> arch_fls(). Given the effort it took to get these consistent, I'm not
>>> interested in letting them diverge.
>>>
>>> But, if someone wants to introduce BITS_PER_INT to mirror BITS_PER_LONG
>>> and use it consistently, then that would be ok too.
>
> Oleksii: I see your patch is committed, but when I said "use it
> consistently", I meant "patch ARM and PPC too".
>> I was actually hoping to eliminate BITS_PER_LONG at some point, in favor
>> of using sizeof(long) * BITS_PER_BYTE. (Surely in common code we could
>> retain a shorthand of that name, if so desired, but I see no reason why
>> each arch would need to provide all three BITS_PER_{BYTE,INT,LONG}.)
>
> The concern is legibility and clarity.
>
> This:
>
> ((x) ? 32 - __builtin_clz(x) : 0)
>
> is a clear expression in a way that this:
>
> ((x) ? (sizeof(int) * BITS_PER_BYTE) - __builtin_clz(x) : 0)
>
> is not. The problem is the extra binary expression, and this:
>
> ((x) ? BITS_PER_INT - __builtin_clz(x) : 0)
>
> is still clear, because the reader doesn't have to perform a multiply
> just to figure out what's going on.
>
>
> It is definitely stupid to have each architecture provide their own
> BITS_PER_*. The compiler is in a superior position to provide those
> details, and it should be in a common location.
>
> I don't particularly mind how those constants are derived, but one key
> thing that BITS_PER_* can do which sizeof() can't is be used in #ifdef/etc.
This is a fair point indeed.
Jan
On 2/26/25 6:20 PM, Andrew Cooper wrote:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>
> Depends on "xen/riscv: make zbb as mandatory"
> ---
>  xen/arch/riscv/include/asm/bitops.h | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/xen/arch/riscv/include/asm/bitops.h b/xen/arch/riscv/include/asm/bitops.h
> index d22eec1e87c7..df3df93520c5 100644
> --- a/xen/arch/riscv/include/asm/bitops.h
> +++ b/xen/arch/riscv/include/asm/bitops.h
> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>  #undef NOT
>  #undef __AMO
>
> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
> +#define arch_flsl(x) ((x) ? BITS_PER_LONG - __builtin_clzl(x) : 0)
> +
> +#define arch_heightl(x) __builtin_popcountl(x)
> +
>  #endif /* ASM__RISCV__BITOPS_H */
>
>  /*

LGTM: Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Thanks.

> base-commit: 7cf163879c5add0a4f7f9c987b61f04f8f7051b1
> prerequisite-patch-id: 9ee1f7ebf5d34b1c565ee2d3d4fb319164bb8bcd
> prerequisite-patch-id: 8a05c87c8d051a3ac0820887f676bbd318e4ae88
> prerequisite-patch-id: 6b56e42d130d8b5ee39457b6760b05cc6e16b049
> prerequisite-patch-id: c139f1f5741d695cd5e5aa6be904edcb61b73885
> prerequisite-patch-id: 99f8b701000e9ee11060934e627a988ddf9aaaa7

Could you please tell me how you generate this one?

~ Oleksii
On 26/02/2025 9:01 pm, Oleksii Kurochko wrote:
>
> On 2/26/25 6:20 PM, Andrew Cooper wrote:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>>
>> Depends on "xen/riscv: make zbb as mandatory"
>> ---
>>  xen/arch/riscv/include/asm/bitops.h | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/xen/arch/riscv/include/asm/bitops.h b/xen/arch/riscv/include/asm/bitops.h
>> index d22eec1e87c7..df3df93520c5 100644
>> --- a/xen/arch/riscv/include/asm/bitops.h
>> +++ b/xen/arch/riscv/include/asm/bitops.h
>> @@ -125,6 +125,13 @@ static inline void clear_bit(int nr, volatile void *p)
>>  #undef NOT
>>  #undef __AMO
>>
>> +#define arch_ffs(x) ((x) ? 1 + __builtin_ctz(x) : 0)
>> +#define arch_ffsl(x) ((x) ? 1 + __builtin_ctzl(x) : 0)
>> +#define arch_fls(x) ((x) ? 32 - __builtin_clz(x) : 0)
>> +#define arch_flsl(x) ((x) ? BITS_PER_LONG - __builtin_clzl(x) : 0)
>> +
>> +#define arch_heightl(x) __builtin_popcountl(x)
>> +
>>  #endif /* ASM__RISCV__BITOPS_H */
>>
>>  /*
> LGTM: Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

Thanks.  This form is much easier than making alternatives for it.

> Thanks.
>> base-commit: 7cf163879c5add0a4f7f9c987b61f04f8f7051b1
>> prerequisite-patch-id: 9ee1f7ebf5d34b1c565ee2d3d4fb319164bb8bcd
>> prerequisite-patch-id: 8a05c87c8d051a3ac0820887f676bbd318e4ae88
>> prerequisite-patch-id: 6b56e42d130d8b5ee39457b6760b05cc6e16b049
>> prerequisite-patch-id: c139f1f5741d695cd5e5aa6be904edcb61b73885
>> prerequisite-patch-id: 99f8b701000e9ee11060934e627a988ddf9aaaa7
> Could you please tell me how you generate this one?

In gitconfig,

    [format]
        useAutoBase = "whenAble"

or --base=auto on a git format-patch command.

This is a poor example.
Those prereq ids are:

* 307e136282d8 - (HEAD -> riscv-isa) RISCV/bitops: Use Zbb to provide arch-optimised bitops (7 hours ago) <Andrew Cooper>
* 519dcd50e4cd - xen/riscv: make zbb as mandatory (12 hours ago) <Oleksii Kurochko>
* c9bf2a9ac22c - xen/riscv: identify specific ISA supported by cpu (12 hours ago) <Oleksii Kurochko>
* 9014de63aa14 - automation: drop debian:11-riscv64 container (12 hours ago) <Oleksii Kurochko>
* 5febb98d11f3 - xen/riscv: drop CONFIG_RISCV_ISA_RV64G (12 hours ago) <Oleksii Kurochko>
* 5a7c9fd746af - xen/README: add compiler and binutils versions for RISCV-64 (12 hours ago) <Oleksii Kurochko>
* 7cf163879c5a - (xenbits/staging, xenbits/master, upstream/staging, upstream/master, origin/staging, origin/master, origin/HEAD, staging, pending, master) PPC: Activate UBSAN in testing (12 hours ago) <Andrew Cooper>

as I pulled your series in locally.

~Andrew
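For reference, a throwaway-repository demonstration of the --base mechanism described above (repository contents, names, and commit messages are invented for illustration; --base=auto / format.useAutoBase=whenAble would pick the base automatically from an upstream tracking branch, while an explicit commit works anywhere):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

# Two commits: the second is the patch, the first becomes its base.
echo one > file.txt && git add file.txt && git commit -qm "base commit"
echo two >> file.txt && git commit -qam "change on top"

# format-patch records the base in a "base-commit:" trailer, exactly as
# seen in the posted patch; commits between base and patch would appear
# as "prerequisite-patch-id:" trailers.
git format-patch --base=HEAD~1 -1 --stdout | grep '^base-commit:'
```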