[PATCH] ARM/vgic: Use for_each_set_bit() in vgic_sync_from_lrs()

Andrew Cooper posted 1 patch 1 month, 1 week ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20260305232845.62024-1-andrew.cooper3@citrix.com
[PATCH] ARM/vgic: Use for_each_set_bit() in vgic_sync_from_lrs()
Posted by Andrew Cooper 1 month, 1 week ago
lr_mask doesn't have bits set beyond the hardware limit, with the upper bits
remaining zero.  Therefore, for_each_set_bit() is a better option.

For ARM64, bloat-o-meter reports:

  Function                                     old     new   delta
  vgic_sync_from_lrs                           208     168     -40

but this doesn't highlight that it also removes a call to find_next_bit() from
each loop iteration.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Ayan Kumar Halder <ayan.kumar.halder@amd.com>

RFC.  This form also doesn't suffer an OoB read when lr_mask changes type.
---
 xen/arch/arm/gic-vgic.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index ea48c5375a91..fae80e6cd293 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -241,9 +241,7 @@ static void gic_update_one_lr(struct vcpu *v, int i)
 
 void vgic_sync_from_lrs(struct vcpu *v)
 {
-    int i = 0;
     unsigned long flags;
-    unsigned int nr_lrs = gic_get_nr_lrs();
 
     /* The idle domain has no LRs to be cleared. Since gic_restore_state
      * doesn't write any LR registers for the idle domain they could be
@@ -255,11 +253,8 @@ void vgic_sync_from_lrs(struct vcpu *v)
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
 
-    while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
-                              nr_lrs, i)) < nr_lrs ) {
+    for_each_set_bit ( i, this_cpu(lr_mask) )
         gic_update_one_lr(v, i);
-        i++;
-    }
 
     spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 }

base-commit: bdd49cc2f61510797a47ad81486be653633ab3ee
-- 
2.39.5
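
For readers following along outside the Xen tree, the two loop shapes in the patch can be compared with a minimal standalone sketch. Everything named here is a hypothetical stand-in rather than Xen code: find_next_bit_word() only mimics the contract of find_next_bit() for a single word, and the second loop only approximates what a by-value for_each_set_bit() does, using the GCC/Clang builtin __builtin_ctzl() (an assumption about the toolchain).

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical single-word stand-in for find_next_bit(): returns the index
 * of the first set bit at or after 'offset', or 'size' if there is none.
 * Assumes size <= BITS_PER_LONG.
 */
static unsigned int find_next_bit_word(const unsigned long *addr,
                                       unsigned int size,
                                       unsigned int offset)
{
    while ( offset < size )
    {
        if ( *addr & (1UL << offset) )
            return offset;
        offset++;
    }

    return size;
}

/* Old shape: one helper call per iteration, bounded by an explicit limit. */
unsigned int collect_old(unsigned long mask, unsigned int nr,
                         unsigned int *out)
{
    unsigned int i = 0, n = 0;

    while ( (i = find_next_bit_word(&mask, nr, i)) < nr )
    {
        out[n++] = i;
        i++;
    }

    return n;
}

/*
 * New shape: iterate over a copy of the value, clearing the lowest set bit
 * each time.  No limit is needed as long as no bits are set beyond it.
 */
unsigned int collect_new(unsigned long mask, unsigned int *out)
{
    unsigned int n = 0;

    for ( unsigned long m = mask; m; m &= m - 1 )
        out[n++] = (unsigned int)__builtin_ctzl(m);

    return n;
}
```

Both functions visit the same indices in the same order provided the mask has no bits set at or above nr, which is the property the commit message relies on for lr_mask.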


Re: [PATCH] ARM/vgic: Use for_each_set_bit() in vgic_sync_from_lrs()
Posted by Julien Grall 1 month, 1 week ago
Hi,

On 05/03/2026 23:28, Andrew Cooper wrote:
> lr_mask doesn't have bits set beyond the hardware limit, with the upper bits
> remaining zero.  Therefore, for_each_set_bit() is a better option.
> 
> For ARM64, bloat-o-meter reports:
> 
>    Function                                     old     new   delta
>    vgic_sync_from_lrs                           208     168     -40
> 
> but this doesn't highlight that it also removes a call to find_next_bit() from
> each loop iteration.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Julien Grall <julien@xen.org>

> ---
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Michal Orzel <michal.orzel@amd.com>
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
> 
> RFC.  This form also doesn't suffer an OoB read when lr_mask changes type.

And potentially unaligned as well. On Arm 32, this would result in a 
crash because we forbid unaligned access.

Cheers,

-- 
Julien Grall
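
To illustrate the out-of-bounds and alignment concern discussed above, here is a rough sketch under stated assumptions. It supposes a narrower mask type (uint16_t is purely hypothetical, not a type proposed in this thread) and uses made-up helper names. The first helper computes how many bytes a word-at-a-time bitmap routine would touch; the second iterates over the value itself and so never dereferences a pointer at all.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Bytes read by a helper that walks a bitmap in whole 'unsigned long'
 * words, as pointer-based bitmap routines typically do.  Even a handful
 * of bits costs a full word.
 */
size_t bitmap_bytes_touched(unsigned int nbits)
{
    return ((nbits + BITS_PER_LONG - 1) / BITS_PER_LONG) *
           sizeof(unsigned long);
}

/*
 * By-value iteration over a hypothetical narrower mask.  The value is
 * widened on copy, so neither the size nor the alignment of the original
 * object matters.
 */
unsigned int count_set_by_value(uint16_t mask)
{
    unsigned int n = 0;

    for ( unsigned long m = mask; m; m &= m - 1 )
        n++;

    return n;
}
```

With such a type, bitmap_bytes_touched(4) is still sizeof(unsigned long), which is larger than sizeof(uint16_t), and an object of that type is only guaranteed 2-byte alignment. A cast to (const unsigned long *) would therefore risk both problems, whereas the by-value loop avoids them.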


Re: [PATCH] ARM/vgic: Use for_each_set_bit() in vgic_sync_from_lrs()
Posted by Orzel, Michal 1 month, 1 week ago

On 06/03/2026 00:28, Andrew Cooper wrote:
> lr_mask doesn't have bits set beyond the hardware limit, with the upper bits
> remaining zero.  Therefore, for_each_set_bit() is a better option.
> 
> For ARM64, bloat-o-meter reports:
> 
>   Function                                     old     new   delta
>   vgic_sync_from_lrs                           208     168     -40
> 
> but this doesn't highlight that it also removes a call to find_next_bit() from
> each loop iteration.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal