The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same
can be achieved with an atomic increment, which is both simpler to read and
avoids any need for a loop.

Note there can be a small divergence in the channel returned if next_channel
overflows, but the returned channel will always be in the [0, num_hpets_used)
range, and that's fine for the purpose of balancing HPET channels across CPUs.
This is also theoretical, as there's currently no system with 2^32 CPUs (as
long as next_channel is 32 bits wide).
Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/hpet.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index d982b0f6b2c9..4777dc859d96 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -458,11 +458,7 @@ static struct hpet_event_channel *hpet_get_channel(unsigned int cpu)
     if ( num_hpets_used >= nr_cpu_ids )
         return &hpet_events[cpu];
 
-    do {
-        next = next_channel;
-        if ( (i = next + 1) == num_hpets_used )
-            i = 0;
-    } while ( cmpxchg(&next_channel, next, i) != next );
+    next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used;
 
     /* try unused channel first */
     for ( i = next; i < next + num_hpets_used; i++ )
--
2.43.0
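For illustration (not part of the patch), here is a minimal standalone sketch
of the two approaches, assuming C11 atomics in place of Xen's cmpxchg() and
arch_fetch_and_add() primitives; the pick_channel_* names and the example
value of num_hpets_used are hypothetical:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint32_t next_channel;
static const uint32_t num_hpets_used = 3;      /* example value */

/* Old approach: CAS loop storing a pre-wrapped value, so the shared
 * counter itself always stays within [0, num_hpets_used). */
static uint32_t pick_channel_cmpxchg(void)
{
    uint32_t next, i;

    do {
        next = atomic_load(&next_channel);
        i = (next + 1 == num_hpets_used) ? 0 : next + 1;
    } while ( !atomic_compare_exchange_weak(&next_channel, &next, i) );

    return next;
}

/* New approach: a single atomic increment; the counter free-runs and is
 * only reduced modulo num_hpets_used at the point of use. */
static uint32_t pick_channel_fetch_add(void)
{
    return atomic_fetch_add(&next_channel, 1) % num_hpets_used;
}

int main(void)
{
    /* Starting from 0, both schemes hand out channels round-robin:
     * 0, 1, 2, 0, 1. */
    for ( int n = 0; n < 5; n++ )
        printf("cmpxchg:   %u\n", pick_channel_cmpxchg());

    atomic_store(&next_channel, 0);            /* reset for second demo */

    for ( int n = 0; n < 5; n++ )
        printf("fetch_add: %u\n", pick_channel_fetch_add());

    return 0;
}
```

Both functions return a value in [0, num_hpets_used); the difference is only
in what the shared counter itself is allowed to hold, which is what makes the
overflow note in the commit message relevant.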
On 22.02.2024 10:05, Roger Pau Monne wrote:
> The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same
> can be achieved with an atomic increment, which is both simpler to read and
> avoids any need for a loop.
>
> Note there can be a small divergence in the channel returned if next_channel
> overflows, but the returned channel will always be in the [0, num_hpets_used)
> range, and that's fine for the purpose of balancing HPET channels across CPUs.
> This is also theoretical, as there's currently no system with 2^32 CPUs (as
> long as next_channel is 32 bits wide).
>
> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>

Hmm, I'm sorry - it's now me who is recorded as the author of the patch,
for my script not finding any Signed-off-by: here (note the typo).

Jan
Hi Roger,

On 22/02/2024 09:05, Roger Pau Monne wrote:
> The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same
> can be achieved with an atomic increment, which is both simpler to read and
> avoids any need for a loop.
>
> Note there can be a small divergence in the channel returned if next_channel
> overflows, but the returned channel will always be in the [0, num_hpets_used)
> range, and that's fine for the purpose of balancing HPET channels across CPUs.
> This is also theoretical, as there's currently no system with 2^32 CPUs (as
> long as next_channel is 32 bits wide).

This would also happen if we decide to reduce the size of next_channel.

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall
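Julien's point about a narrower counter can be made concrete with a
deliberately tiny 8-bit stand-in for next_channel, which reaches the wrap
after only 256 increments; a hypothetical standalone snippet (not from the
thread), with 3 as an example num_hpets_used:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t counter = 250;    /* narrow stand-in for next_channel, near wrap */
    const uint8_t n = 3;      /* example num_hpets_used */

    /* 256 % 3 == 1, so the round-robin pattern slips by one slot at each
     * wrap: calls 250..261 yield 1 2 0 1 2 0 0 1 2 0 1 2 - channel 0 is
     * handed out twice in a row when the counter overflows. */
    for ( unsigned int call = 250; call < 262; call++ )
    {
        printf("call %u -> channel %u\n", call, counter % n);
        counter++;
    }

    return 0;
}
```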
On 22.02.2024 10:05, Roger Pau Monne wrote:
> The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same
> can be achieved with an atomic increment, which is both simpler to read and
> avoids any need for a loop.
>
> Note there can be a small divergence in the channel returned if next_channel
> overflows, but the returned channel will always be in the [0, num_hpets_used)
> range, and that's fine for the purpose of balancing HPET channels across CPUs.
> This is also theoretical, as there's currently no system with 2^32 CPUs (as
> long as next_channel is 32 bits wide).
The code change looks good to me, but I fail to see the connection to
2^32 CPUs. So it feels like I'm missing something, in which case I'd
rather not offer any R-b.
Jan
> Signed-of-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> xen/arch/x86/hpet.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
> index d982b0f6b2c9..4777dc859d96 100644
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -458,11 +458,7 @@ static struct hpet_event_channel *hpet_get_channel(unsigned int cpu)
>      if ( num_hpets_used >= nr_cpu_ids )
>          return &hpet_events[cpu];
>  
> -    do {
> -        next = next_channel;
> -        if ( (i = next + 1) == num_hpets_used )
> -            i = 0;
> -    } while ( cmpxchg(&next_channel, next, i) != next );
> +    next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used;
>  
>      /* try unused channel first */
>      for ( i = next; i < next + num_hpets_used; i++ )
On Thu, Feb 22, 2024 at 11:10:54AM +0100, Jan Beulich wrote:
> On 22.02.2024 10:05, Roger Pau Monne wrote:
> > The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same
> > can be achieved with an atomic increment, which is both simpler to read and
> > avoids any need for a loop.
> >
> > Note there can be a small divergence in the channel returned if next_channel
> > overflows, but the returned channel will always be in the [0, num_hpets_used)
> > range, and that's fine for the purpose of balancing HPET channels across CPUs.
> > This is also theoretical, as there's currently no system with 2^32 CPUs (as
> > long as next_channel is 32 bits wide).
>
> The code change looks good to me, but I fail to see the connection to
> 2^32 CPUs. So it feels like I'm missing something, in which case I'd
> rather not offer any R-b.

The purpose of hpet_get_channel() is to distribute HPET channels across
CPUs, so that each CPU gets assigned an HPET channel, balancing the
number of CPUs that use each channel.

This is done by (next_channel++ % num_hpets_used), so the value of
next_channel after this change can be > num_hpets_used, which previously
wasn't possible. This can lead to a different channel being returned for
the 2^32nd call to the function, as the counter would overflow. Note
calls to the function are restricted to the number of CPUs in the host,
as per-CPU channel assignment is done only once (and the channel is then
stored in a per-cpu variable).

Feel free to adjust the commit message if you think all this is too much
data, or too confusing.

Thanks, Roger.
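A toy model of the balancing Roger describes, using hypothetical example
values (8 CPUs, 3 HPET channels; plain C, not code from Xen), shows how the
free-running counter still spreads CPUs evenly across channels:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const unsigned int nr_cpus = 8;       /* example host size */
    const uint32_t n = 3;                 /* example num_hpets_used */
    uint32_t next_channel = 0;            /* shared counter, free-running */
    unsigned int cpus_per_channel[3] = { 0 };

    /* Each CPU being brought up draws one value from the shared counter,
     * so channels are handed out round-robin and stay balanced. */
    for ( unsigned int cpu = 0; cpu < nr_cpus; cpu++ )
    {
        uint32_t channel = next_channel++ % n;

        printf("CPU%u -> channel %u\n", cpu, channel);
        cpus_per_channel[channel]++;
    }

    /* 8 CPUs over 3 channels: the counts come out 3, 3, 2. */
    for ( uint32_t c = 0; c < n; c++ )
        printf("channel %u serves %u CPUs\n", c, cpus_per_channel[c]);

    return 0;
}
```

The modulo keeps every result in [0, num_hpets_used) even once the counter
has wrapped, which is why the only effect of overflow is the one-slot slip
described in the commit message.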
On 22.02.2024 11:58, Roger Pau Monné wrote:
> On Thu, Feb 22, 2024 at 11:10:54AM +0100, Jan Beulich wrote:
>> On 22.02.2024 10:05, Roger Pau Monne wrote:
>>> The usage of a cmpxchg loop in hpet_get_channel() is unnecessary, as the same
>>> can be achieved with an atomic increment, which is both simpler to read and
>>> avoids any need for a loop.
>>>
>>> Note there can be a small divergence in the channel returned if next_channel
>>> overflows, but the returned channel will always be in the [0, num_hpets_used)
>>> range, and that's fine for the purpose of balancing HPET channels across CPUs.
>>> This is also theoretical, as there's currently no system with 2^32 CPUs (as
>>> long as next_channel is 32 bits wide).
>>
>> The code change looks good to me, but I fail to see the connection to
>> 2^32 CPUs. So it feels like I'm missing something, in which case I'd
>> rather not offer any R-b.
>
> The purpose of hpet_get_channel() is to distribute HPET channels across
> CPUs, so that each CPU gets assigned an HPET channel, balancing the
> number of CPUs that use each channel.
>
> This is done by (next_channel++ % num_hpets_used), so the value of
> next_channel after this change can be > num_hpets_used, which previously
> wasn't possible. This can lead to a different channel being returned for
> the 2^32nd call to the function, as the counter would overflow. Note
> calls to the function are restricted to the number of CPUs in the host,
> as per-CPU channel assignment is done only once (and the channel is then
> stored in a per-cpu variable).

That's once per CPU being brought up, not once per CPU in the system.

> Feel free to adjust the commit message if you think all this is too much
> data, or too confusing.

I'll simply drop that last sentence then, without which
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan