When adjusting move_cleanup_count to account for CPUs that are offline also
adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
those again creating and create an imbalance in move_cleanup_count.
Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/irq.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index c16205a9beb6..9716e00e873b 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
             desc->arch.move_cleanup_count -= cpumask_weight(affinity);
             if ( !desc->arch.move_cleanup_count )
                 release_old_vec(desc);
+            else
+                /*
+                 * Adjust old_cpu_mask to account for the offline CPUs,
+                 * otherwise further calls to fixup_irqs() could subtract those
+                 * again and possibly underflow the counter.
+                 */
+                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
+                            &cpu_online_map);
         }
 
         if ( !desc->action || cpumask_subset(desc->affinity, mask) )
--
2.44.0
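
To make the failure mode concrete, here is a minimal standalone sketch of the
double subtraction (plain C; an unsigned long bitmask stands in for the
cpumask, a local weight() helper for cpumask_weight(), and the CPU layout is
invented for illustration):

    #include <stdio.h>

    /* Stand-in for cpumask_weight(): population count of the mask. */
    static unsigned int weight(unsigned long m)
    {
        unsigned int w = 0;

        for ( ; m; m &= m - 1 )
            w++;

        return w;
    }

    int main(void)
    {
        unsigned long online = 0x0f;       /* CPUs 0-3 online, 4-7 offline */
        unsigned long old_cpu_mask = 0x34; /* cleanup pending on CPUs 2, 4, 5 */
        unsigned int move_cleanup_count = weight(old_cpu_mask); /* 3 */

        /* First fixup_irqs() pass: subtract the offline CPUs (4 and 5). */
        move_cleanup_count -= weight(old_cpu_mask & ~online); /* 3 - 2 = 1 */

        /*
         * old_cpu_mask was not trimmed, so a further pass subtracts the same
         * two CPUs again and the unsigned counter wraps.
         */
        move_cleanup_count -= weight(old_cpu_mask & ~online); /* 1 - 2 */

        printf("move_cleanup_count = %u\n", move_cleanup_count);
        return 0;
    }

With the patch applied, the first pass also removes CPUs 4 and 5 from
old_cpu_mask, so the second pass finds nothing left to subtract.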
On 29.05.2024 11:01, Roger Pau Monne wrote:
> When adjusting move_cleanup_count to account for CPUs that are offline also
> adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
> those again creating and create an imbalance in move_cleanup_count.
I'm in trouble with "creating"; I can't seem to be able to guess what you may
have meant.
> Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
With the above clarified (adjustment can be done while committing)
Reviewed-by: Jan Beulich <jbeulich@suse.com>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
>              desc->arch.move_cleanup_count -= cpumask_weight(affinity);
>              if ( !desc->arch.move_cleanup_count )
>                  release_old_vec(desc);
> +            else
> +                /*
> +                 * Adjust old_cpu_mask to account for the offline CPUs,
> +                 * otherwise further calls to fixup_irqs() could subtract those
> +                 * again and possibly underflow the counter.
> +                 */
> +                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
> +                            &cpu_online_map);
>          }
While functionality-wise okay, imo it would be slightly better to use
"affinity" here as well, so that even without looking at context beyond
what's shown here there is a direct connection to the cpumask_weight()
call. I.e.
    cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
                   affinity);
Thoughts?
Jan
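
For reference, the code immediately above the hunk (a reconstruction based on
the Fixes commit, not quoted in the thread, so the exact shape may differ)
computes "affinity" as precisely the offline subset of old_cpu_mask, which is
what ties the suggested cpumask_andnot() back to the cpumask_weight() call:

    if ( desc->arch.move_cleanup_count )
    {
        /* The cleanup IPI may have got sent while we were still online. */
        cpumask_andnot(affinity, desc->arch.old_cpu_mask,
                       &cpu_online_map);
        desc->arch.move_cleanup_count -= cpumask_weight(affinity);
        if ( !desc->arch.move_cleanup_count )
            release_old_vec(desc);
    }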
On Wed, May 29, 2024 at 02:40:51PM +0200, Jan Beulich wrote:
> On 29.05.2024 11:01, Roger Pau Monne wrote:
> > When adjusting move_cleanup_count to account for CPUs that are offline also
> > adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
> > those again creating and create an imbalance in move_cleanup_count.
>
> I'm in trouble with "creating"; I can't seem to be able to guess what you may
> have meant.
Oh, sorry, that's a typo.
I was meaning to point out that not removing the already subtracted
CPUs from the mask can lead to further calls to fixup_irqs()
subtracting them again and move_cleanup_count possibly underflowing.
Would you prefer to write it as:
"... could subtract those again and possibly underflow move_cleanup_count."
> > Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> With the above clarified (adjustment can be done while committing)
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> >              desc->arch.move_cleanup_count -= cpumask_weight(affinity);
> >              if ( !desc->arch.move_cleanup_count )
> >                  release_old_vec(desc);
> > +            else
> > +                /*
> > +                 * Adjust old_cpu_mask to account for the offline CPUs,
> > +                 * otherwise further calls to fixup_irqs() could subtract those
> > +                 * again and possibly underflow the counter.
> > +                 */
> > +                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
> > +                            &cpu_online_map);
> >          }
>
> While functionality-wise okay, imo it would be slightly better to use
> "affinity" here as well, so that even without looking at context beyond
> what's shown here there is a direct connection to the cpumask_weight()
> call. I.e.
>
>     cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
>                    affinity);
>
> Thoughts?
It was more straightforward for me to reason that removing the offline
CPUs is OK, but I can see that you might prefer to use 'affinity',
because that's the weight that's subtracted from move_cleanup_count.
Using either should lead to the same result if my understanding is
correct.
Thanks, Roger.
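
The claimed equivalence is easy to check in isolation: with affinity =
old_cpu_mask & ~online, "old & online" and "old & ~affinity" always select the
same bits. A small standalone sketch (plain C, exhaustive over 8-bit masks;
the 8-bit width is an arbitrary stand-in for the number of CPUs):

    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int old, online;

        for ( old = 0; old < 256; old++ )
            for ( online = 0; online < 256; online++ )
            {
                /* What fixup_irqs() subtracts: the offline part of the mask. */
                unsigned int affinity = old & ~online;

                /*
                 * cpumask_and(old, old, online) versus
                 * cpumask_andnot(old, old, affinity).
                 */
                assert((old & online) == (old & ~affinity));
            }

        printf("both forms agree for all masks\n");
        return 0;
    }

Which form to use is therefore purely a readability choice.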
On 29.05.2024 17:15, Roger Pau Monné wrote:
> On Wed, May 29, 2024 at 02:40:51PM +0200, Jan Beulich wrote:
>> On 29.05.2024 11:01, Roger Pau Monne wrote:
>>> When adjusting move_cleanup_count to account for CPUs that are offline also
>>> adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
>>> those again creating and create an imbalance in move_cleanup_count.
>>
>> I'm in trouble with "creating"; I can't seem to be able to guess what you may
>> have meant.
>
> Oh, sorry, that's a typo.
>
> I was meaning to point out that not removing the already subtracted
> CPUs from the mask can lead to further calls to fixup_irqs()
> subtracting them again and move_cleanup_count possibly underflowing.
>
> Would you prefer to write it as:
>
> "... could subtract those again and possibly underflow move_cleanup_count."

Fine with me. Looks like simply deleting "creating" and keeping the rest
as it was would be okay too? Whatever you prefer in the end.

>>> --- a/xen/arch/x86/irq.c
>>> +++ b/xen/arch/x86/irq.c
>>> @@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
>>>              desc->arch.move_cleanup_count -= cpumask_weight(affinity);
>>>              if ( !desc->arch.move_cleanup_count )
>>>                  release_old_vec(desc);
>>> +            else
>>> +                /*
>>> +                 * Adjust old_cpu_mask to account for the offline CPUs,
>>> +                 * otherwise further calls to fixup_irqs() could subtract those
>>> +                 * again and possibly underflow the counter.
>>> +                 */
>>> +                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
>>> +                            &cpu_online_map);
>>>          }
>>
>> While functionality-wise okay, imo it would be slightly better to use
>> "affinity" here as well, so that even without looking at context beyond
>> what's shown here there is a direct connection to the cpumask_weight()
>> call. I.e.
>>
>>     cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
>>                    affinity);
>>
>> Thoughts?
>
> It was more straightforward for me to reason that removing the offline
> CPUs is OK, but I can see that you might prefer to use 'affinity',
> because that's the weight that's subtracted from move_cleanup_count.
> Using either should lead to the same result if my understanding is
> correct.

That was the conclusion I came to, or else I wouldn't have made the
suggestion. Unless you have a strong preference for the as-is form, I'd
indeed prefer the suggested alternative.

Jan
On Wed, May 29, 2024 at 05:27:06PM +0200, Jan Beulich wrote:
> On 29.05.2024 17:15, Roger Pau Monné wrote:
> > On Wed, May 29, 2024 at 02:40:51PM +0200, Jan Beulich wrote:
> >> On 29.05.2024 11:01, Roger Pau Monne wrote:
> >>> When adjusting move_cleanup_count to account for CPUs that are offline also
> >>> adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
> >>> those again creating and create an imbalance in move_cleanup_count.
> >>
> >> I'm in trouble with "creating"; I can't seem to be able to guess what you may
> >> have meant.
> >
> > Oh, sorry, that's a typo.
> >
> > I was meaning to point out that not removing the already subtracted
> > CPUs from the mask can lead to further calls to fixup_irqs()
> > subtracting them again and move_cleanup_count possibly underflowing.
> >
> > Would you prefer to write it as:
> >
> > "... could subtract those again and possibly underflow move_cleanup_count."
>
> Fine with me. Looks like simply deleting "creating" and keeping the rest
> as it was would be okay too? Whatever you prefer in the end.

Yes, whatever you think is clearer TBH, I don't really have a
preference.

Thanks, Roger.