x86/irq: fix calculation of maximum pIRQs for dom0

[PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Roger Pau Monne 1 month ago

The current calculation of PV dom0 pIRQs uses:

n = min(fls(num_present_cpus()), dom0_max_vcpus());

The usage of fls() is wrong, as num_present_cpus() already returns the number
of present CPUs, not the bitmap mask of CPUs.

Fix by removing the usage of fls().

Fixes: 7e73a6e7f12a ('have architectures specify the number of PIRQs a hardware domain gets')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/io_apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index d44d2c9a4173..bd5ad61c85e4 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -2744,7 +2744,7 @@ void __init ioapic_init(void)
 
 unsigned int __hwdom_init arch_hwdom_irqs(const struct domain *d)
 {
-    unsigned int n = fls(num_present_cpus());
+    unsigned int n = num_present_cpus();
     /* Bounding by the domain pirq EOI bitmap capacity. */
     const unsigned int max_irqs = min_t(unsigned int, nr_irqs,
                                         PAGE_SIZE * BITS_PER_BYTE);
-- 
2.46.0

Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Jan Beulich 1 month ago

On 20.11.2024 12:35, Roger Pau Monne wrote:
> The current calculation of PV dom0 pIRQs uses:
> 
> n = min(fls(num_present_cpus()), dom0_max_vcpus());
> 
> The usage of fls() is wrong, as num_present_cpus() already returns the number
> of present CPUs, not the bitmap mask of CPUs.

Hmm. Perhaps that use of fls() should have been accompanied by a comment, but
I think it might have been put there intentionally, to avoid linear growth.
Which isn't to say that I mind the adjustment, especially now that we don't
use any clustered modes anymore for I/O interrupts. I'm merely questioning
the Fixes: tag, and with that whether / how far to backport.

Jan

Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Roger Pau Monné 1 month ago

On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote:
> On 20.11.2024 12:35, Roger Pau Monne wrote:
> > The current calculation of PV dom0 pIRQs uses:
> > 
> > n = min(fls(num_present_cpus()), dom0_max_vcpus());
> > 
> > The usage of fls() is wrong, as num_present_cpus() already returns the number
> > of present CPUs, not the bitmap mask of CPUs.
> 
> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but
> I think it might have been put there intentionally, to avoid linear growth.
> Which isn't to say that I mind the adjustment, especially now that we don't
> use any clustered modes anymore for I/O interrupts. I'm merely questioning
> the Fixes: tag, and with that whether / how far to backport.

Hm, sorry I've assumed the fls() was a typo.  It seems wrong to cap
dom0 vCPUs with the fls of the present CPUs number.  For consistency,
if the intention was to use fls to limit growth, I would have expected
to also be applied to the dom0 number of vCPUs.  And a comment would
have been nice indeed :).

In any case this is hurting XenServer now: we got reports of pIRQ
exhaustion on some systems.

Thanks, Roger.

Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Jan Beulich 1 month ago

On 21.11.2024 12:04, Roger Pau Monné wrote:
> On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote:
>> On 20.11.2024 12:35, Roger Pau Monne wrote:
>>> The current calculation of PV dom0 pIRQs uses:
>>>
>>> n = min(fls(num_present_cpus()), dom0_max_vcpus());
>>>
>>> The usage of fls() is wrong, as num_present_cpus() already returns the number
>>> of present CPUs, not the bitmap mask of CPUs.
>>
>> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but
>> I think it might have been put there intentionally, to avoid linear growth.
>> Which isn't to say that I mind the adjustment, especially now that we don't
>> use any clustered modes anymore for I/O interrupts. I'm merely questioning
>> the Fixes: tag, and with that whether / how far to backport.
> 
> Hm, sorry I've assumed the fls() was a typo.  It seems wrong to cap
> dom0 vCPUs with the fls of the present CPUs number.  For consistency,
> if the intention was to use fls to limit growth, I would have expected
> to also be applied to the dom0 number of vCPUs.

FTR: My vague recollection (it has been nearly 10 years) is that I first had
it there, too. Until I realized that it hardly ever would have any effect,
because of the min(). And for Dom0-s with extremely few vCPU-s it seemed
reasonable to not apply that cap there.

Jan

Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Roger Pau Monné 1 month ago

On Thu, Nov 21, 2024 at 12:39:06PM +0100, Jan Beulich wrote:
> On 21.11.2024 12:04, Roger Pau Monné wrote:
> > On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote:
> >> On 20.11.2024 12:35, Roger Pau Monne wrote:
> >>> The current calculation of PV dom0 pIRQs uses:
> >>>
> >>> n = min(fls(num_present_cpus()), dom0_max_vcpus());
> >>>
> >>> The usage of fls() is wrong, as num_present_cpus() already returns the number
> >>> of present CPUs, not the bitmap mask of CPUs.
> >>
> >> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but
> >> I think it might have been put there intentionally, to avoid linear growth.
> >> Which isn't to say that I mind the adjustment, especially now that we don't
> >> use any clustered modes anymore for I/O interrupts. I'm merely questioning
> >> the Fixes: tag, and with that whether / how far to backport.
> > 
> > Hm, sorry I've assumed the fls() was a typo.  It seems wrong to cap
> > dom0 vCPUs with the fls of the present CPUs number.  For consistency,
> > if the intention was to use fls to limit growth, I would have expected
> > to also be applied to the dom0 number of vCPUs.
> 
> FTR: My vague recollection (it has been nearly 10 years) is that I first had
> it there, too. Until I realized that it hardly ever would have any effect,
> because of the min(). And for Dom0-s with extremely few vCPU-s it seemed
> reasonable to not apply that cap there.

I don't have a strong opinion regarding the fixes tag, but would like
to get this sorted as soon as possible.  If you prefer just drop the
fixes tag.  I think this wants backporting to all supported releases,
because it's an issue XenServer has hit on real servers when dom0 is
sized to use 16 vCPUs at most.

Thanks, Roger.

Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Jan Beulich 1 month ago

On 22.11.2024 09:01, Roger Pau Monné wrote:
> On Thu, Nov 21, 2024 at 12:39:06PM +0100, Jan Beulich wrote:
>> On 21.11.2024 12:04, Roger Pau Monné wrote:
>>> On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote:
>>>> On 20.11.2024 12:35, Roger Pau Monne wrote:
>>>>> The current calculation of PV dom0 pIRQs uses:
>>>>>
>>>>> n = min(fls(num_present_cpus()), dom0_max_vcpus());
>>>>>
>>>>> The usage of fls() is wrong, as num_present_cpus() already returns the number
>>>>> of present CPUs, not the bitmap mask of CPUs.
>>>>
>>>> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but
>>>> I think it might have been put there intentionally, to avoid linear growth.
>>>> Which isn't to say that I mind the adjustment, especially now that we don't
>>>> use any clustered modes anymore for I/O interrupts. I'm merely questioning
>>>> the Fixes: tag, and with that whether / how far to backport.
>>>
>>> Hm, sorry I've assumed the fls() was a typo.  It seems wrong to cap
>>> dom0 vCPUs with the fls of the present CPUs number.  For consistency,
>>> if the intention was to use fls to limit growth, I would have expected
>>> to also be applied to the dom0 number of vCPUs.
>>
>> FTR: My vague recollection (it has been nearly 10 years) is that I first had
>> it there, too. Until I realized that it hardly ever would have any effect,
>> because of the min(). And for Dom0-s with extremely few vCPU-s it seemed
>> reasonable to not apply that cap there.
> 
> I don't have a strong opinion regarding the fixes tag, but would like
> to get this sorted as soon as possible.  If you prefer just drop the
> fixes tag.  I think this wants backporting to all supported releases,
> because it's an issue XenServer has hit on real servers when dom0 is
> sized to use 16 vCPUs at most.

In which case yes, I guess we want to keep the Fixes:.

Jan

Re: [PATCH 1/2] x86/irq: fix calculation of max PV dom0 pIRQs

Posted by Andrew Cooper 1 month ago

On 20/11/2024 11:35 am, Roger Pau Monne wrote:
> The current calculation of PV dom0 pIRQs uses:
>
> n = min(fls(num_present_cpus()), dom0_max_vcpus());
>
> The usage of fls() is wrong, as num_present_cpus() already returns the number
> of present CPUs, not the bitmap mask of CPUs.
>
> Fix by removing the usage of fls().
>
> Fixes: 7e73a6e7f12a ('have architectures specify the number of PIRQs a hardware domain gets')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Yeah, that fls() fails the dimensional analysis sniff test.

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Is there any hint as to what the reasoning was?

~Andrew