lr_mask doesn't have bits set beyond the hardware limit, with the upper bits
remaining zero. Therefore, for_each_set_bit() is a better option.
For ARM64, bloat-o-meter reports:
  Function                                     old     new   delta
  vgic_sync_from_lrs                           208     168     -40
but this doesn't highlight that it also removes a call to find_next_bit() from
each loop iteration.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
RFC. This form also doesn't suffer an OoB read when lr_mask changes type.
---
xen/arch/arm/gic-vgic.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index ea48c5375a91..fae80e6cd293 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -241,9 +241,7 @@ static void gic_update_one_lr(struct vcpu *v, int i)
 void vgic_sync_from_lrs(struct vcpu *v)
 {
-    int i = 0;
     unsigned long flags;
-    unsigned int nr_lrs = gic_get_nr_lrs();
     /* The idle domain has no LRs to be cleared. Since gic_restore_state
      * doesn't write any LR registers for the idle domain they could be
@@ -255,11 +253,8 @@ void vgic_sync_from_lrs(struct vcpu *v)
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
-    while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
-                              nr_lrs, i)) < nr_lrs ) {
+    for_each_set_bit ( i, this_cpu(lr_mask) )
         gic_update_one_lr(v, i);
-        i++;
-    }
     spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 }
base-commit: bdd49cc2f61510797a47ad81486be653633ab3ee
--
2.39.5
Hi,

On 05/03/2026 23:28, Andrew Cooper wrote:
> lr_mask doesn't have bits set beyond the hardware limit, with the upper bits
> remaining zero. Therefore, for_each_set_bit() is a better option.
>
> For ARM64, bloat-o-meter reports:
>
>   Function                                     old     new   delta
>   vgic_sync_from_lrs                           208     168     -40
>
> but this doesn't highlight that it also removes a call to find_next_bit() from
> each loop iteration.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Julien Grall <julien@xen.org>

> ---
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Michal Orzel <michal.orzel@amd.com>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
>
> RFC. This form also doesn't suffer an OoB read when lr_mask changes type.

And potentially unaligned as well. On Arm 32, this would result in a crash
because we forbid unaligned accesses.

Cheers,

--
Julien Grall

On 06/03/2026 00:28, Andrew Cooper wrote:
> lr_mask doesn't have bits set beyond the hardware limit, with the upper bits
> remaining zero. Therefore, for_each_set_bit() is a better option.
>
> For ARM64, bloat-o-meter reports:
>
>   Function                                     old     new   delta
>   vgic_sync_from_lrs                           208     168     -40
>
> but this doesn't highlight that it also removes a call to find_next_bit() from
> each loop iteration.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal
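
For readers following the thread, the difference between the two loop shapes
can be sketched in standalone C. This is only an illustration under stated
assumptions, not Xen's implementation: sketch_find_next_bit(), collect_old(),
collect_new() and same_bits() are hypothetical names invented here, the mask
is assumed to fit in a single unsigned long, and __builtin_ctzl() is a
GCC/Clang builtin. The pointer-based form calls a helper on every iteration
and depends on the pointed-to object really being an unsigned long; the
value-based form works on a copy of the mask and simply clears the lowest set
bit each time round.

```c
#include <limits.h>

/* Hypothetical constant for this sketch: bits in one unsigned long. */
#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Simplified stand-in for a find_next_bit()-style helper (assumption: the
 * bitmap is a single word and size <= SKETCH_BITS_PER_LONG).  Returns the
 * index of the first set bit at or after 'offset', or 'size' if none.
 */
static unsigned int sketch_find_next_bit(const unsigned long *addr,
                                         unsigned int size,
                                         unsigned int offset)
{
    for ( ; offset < size; offset++ )
        if ( *addr & (1UL << offset) )
            return offset;

    return size;
}

/* Old loop shape: pointer-based, one helper call per iteration. */
static unsigned int collect_old(const unsigned long *mask, unsigned int nr,
                                unsigned int *out)
{
    unsigned int n = 0, i = 0;

    while ( (i = sketch_find_next_bit(mask, nr, i)) < nr )
    {
        out[n++] = i;
        i++;
    }

    return n;
}

/*
 * New loop shape: value-based.  The mask is copied, the index of the lowest
 * set bit is taken with a count-trailing-zeros builtin, and that bit is
 * cleared with 'mask &= mask - 1'.  No pointer to the mask is formed.
 */
static unsigned int collect_new(unsigned long mask, unsigned int *out)
{
    unsigned int n = 0;

    for ( ; mask; mask &= mask - 1 )
        out[n++] = (unsigned int)__builtin_ctzl(mask);

    return n;
}

/*
 * Returns 1 if both loop shapes visit the same bits in the same order.
 * This only holds when no bit at or above 'nr' is set in 'mask'.
 */
static int same_bits(unsigned long mask, unsigned int nr)
{
    unsigned int a[sizeof(unsigned long) * CHAR_BIT];
    unsigned int b[sizeof(unsigned long) * CHAR_BIT];
    unsigned int na = collect_old(&mask, nr, a);
    unsigned int nb = collect_new(mask, b);

    if ( na != nb )
        return 0;

    for ( unsigned int i = 0; i < na; i++ )
        if ( a[i] != b[i] )
            return 0;

    return 1;
}
```

With a mask of 0x15 and a limit of 16, both helpers report bits 0, 2 and 4.
With the same mask and a limit of 4 they disagree, because bit 4 lies beyond
the limit. The two forms are therefore interchangeable only when the upper
bits of the mask stay zero, which is the precondition the commit message
states for lr_mask.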