The current calculation of PV dom0 pIRQs uses:
n = min(fls(num_present_cpus()), dom0_max_vcpus());
The usage of fls() is wrong, as num_present_cpus() already returns the number
of present CPUs, not the bitmap mask of CPUs.
Fix by removing the usage of fls().
Fixes: 7e73a6e7f12a ('have architectures specify the number of PIRQs a hardware domain gets')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/io_apic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index d44d2c9a4173..bd5ad61c85e4 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -2744,7 +2744,7 @@ void __init ioapic_init(void)
unsigned int __hwdom_init arch_hwdom_irqs(const struct domain *d)
{
- unsigned int n = fls(num_present_cpus());
+ unsigned int n = num_present_cpus();
/* Bounding by the domain pirq EOI bitmap capacity. */
const unsigned int max_irqs = min_t(unsigned int, nr_irqs,
PAGE_SIZE * BITS_PER_BYTE);
--
2.46.0
On 20.11.2024 12:35, Roger Pau Monne wrote: > The current calculation of PV dom0 pIRQs uses: > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > The usage of fls() is wrong, as num_present_cpus() already returns the number > of present CPUs, not the bitmap mask of CPUs. Hmm. Perhaps that use of fls() should have been accompanied by a comment, but I think it might have been put there intentionally, to avoid linear growth. Which isn't to say that I mind the adjustment, especially now that we don't use any clustered modes anymore for I/O interrupts. I'm merely questioning the Fixes: tag, and with that whether / how far to backport. Jan
On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: > On 20.11.2024 12:35, Roger Pau Monne wrote: > > The current calculation of PV dom0 pIRQs uses: > > > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > > > The usage of fls() is wrong, as num_present_cpus() already returns the number > > of present CPUs, not the bitmap mask of CPUs. > > Hmm. Perhaps that use of fls() should have been accompanied by a comment, but > I think it might have been put there intentionally, to avoid linear growth. > Which isn't to say that I mind the adjustment, especially now that we don't > use any clustered modes anymore for I/O interrupts. I'm merely questioning > the Fixes: tag, and with that whether / how far to backport. Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap dom0 vCPUs with the fls of the present CPUs number. For consistency, if the intention was to use fls to limit growth, I would have expected to also be applied to the dom0 number of vCPUs. And a comment would have been nice indeed :). In any case this is hurting XenServer now: we got reports of pIRQ exhaustion on some systems. Thanks, Roger.
On 21.11.2024 12:04, Roger Pau Monné wrote: > On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: >> On 20.11.2024 12:35, Roger Pau Monne wrote: >>> The current calculation of PV dom0 pIRQs uses: >>> >>> n = min(fls(num_present_cpus()), dom0_max_vcpus()); >>> >>> The usage of fls() is wrong, as num_present_cpus() already returns the number >>> of present CPUs, not the bitmap mask of CPUs. >> >> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but >> I think it might have been put there intentionally, to avoid linear growth. >> Which isn't to say that I mind the adjustment, especially now that we don't >> use any clustered modes anymore for I/O interrupts. I'm merely questioning >> the Fixes: tag, and with that whether / how far to backport. > > Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap > dom0 vCPUs with the fls of the present CPUs number. For consistency, > if the intention was to use fls to limit growth, I would have expected > to also be applied to the dom0 number of vCPUs. FTR: My vague recollection (it has been nearly 10 years) is that I first had it there, too. Until I realized that it hardly ever would have any effect, because of the min(). And for Dom0-s with extremely few vCPU-s it seemed reasonable to not apply that cap there. Jan
On Thu, Nov 21, 2024 at 12:39:06PM +0100, Jan Beulich wrote: > On 21.11.2024 12:04, Roger Pau Monné wrote: > > On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: > >> On 20.11.2024 12:35, Roger Pau Monne wrote: > >>> The current calculation of PV dom0 pIRQs uses: > >>> > >>> n = min(fls(num_present_cpus()), dom0_max_vcpus()); > >>> > >>> The usage of fls() is wrong, as num_present_cpus() already returns the number > >>> of present CPUs, not the bitmap mask of CPUs. > >> > >> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but > >> I think it might have been put there intentionally, to avoid linear growth. > >> Which isn't to say that I mind the adjustment, especially now that we don't > >> use any clustered modes anymore for I/O interrupts. I'm merely questioning > >> the Fixes: tag, and with that whether / how far to backport. > > > > Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap > > dom0 vCPUs with the fls of the present CPUs number. For consistency, > > if the intention was to use fls to limit growth, I would have expected > > to also be applied to the dom0 number of vCPUs. > > FTR: My vague recollection (it has been nearly 10 years) is that I first had > it there, too. Until I realized that it hardly ever would have any effect, > because of the min(). And for Dom0-s with extremely few vCPU-s it seemed > reasonable to not apply that cap there. I don't have a strong opinion regarding the fixes tag, but would like to get this sorted as soon as possible. If you prefer just drop the fixes tag. I think this wants backporting to all supported releases, because it's an issue XenServer has hit on real servers when dom0 is sized to use 16 vCPUs at most. Thanks, Roger.
On 22.11.2024 09:01, Roger Pau Monné wrote: > On Thu, Nov 21, 2024 at 12:39:06PM +0100, Jan Beulich wrote: >> On 21.11.2024 12:04, Roger Pau Monné wrote: >>> On Thu, Nov 21, 2024 at 11:49:44AM +0100, Jan Beulich wrote: >>>> On 20.11.2024 12:35, Roger Pau Monne wrote: >>>>> The current calculation of PV dom0 pIRQs uses: >>>>> >>>>> n = min(fls(num_present_cpus()), dom0_max_vcpus()); >>>>> >>>>> The usage of fls() is wrong, as num_present_cpus() already returns the number >>>>> of present CPUs, not the bitmap mask of CPUs. >>>> >>>> Hmm. Perhaps that use of fls() should have been accompanied by a comment, but >>>> I think it might have been put there intentionally, to avoid linear growth. >>>> Which isn't to say that I mind the adjustment, especially now that we don't >>>> use any clustered modes anymore for I/O interrupts. I'm merely questioning >>>> the Fixes: tag, and with that whether / how far to backport. >>> >>> Hm, sorry I've assumed the fls() was a typo. It seems wrong to cap >>> dom0 vCPUs with the fls of the present CPUs number. For consistency, >>> if the intention was to use fls to limit growth, I would have expected >>> to also be applied to the dom0 number of vCPUs. >> >> FTR: My vague recollection (it has been nearly 10 years) is that I first had >> it there, too. Until I realized that it hardly ever would have any effect, >> because of the min(). And for Dom0-s with extremely few vCPU-s it seemed >> reasonable to not apply that cap there. > > I don't have a strong opinion regarding the fixes tag, but would like > to get this sorted as soon as possible. If you prefer just drop the > fixes tag. I think this wants backporting to all supported releases, > because it's an issue XenServer has hit on real servers when dom0 is > sized to use 16 vCPUs at most. In which case yes, I guess we want to keep the Fixes:. Jan
On 20/11/2024 11:35 am, Roger Pau Monne wrote: > The current calculation of PV dom0 pIRQs uses: > > n = min(fls(num_present_cpus()), dom0_max_vcpus()); > > The usage of fls() is wrong, as num_present_cpus() already returns the number > of present CPUs, not the bitmap mask of CPUs. > > Fix by removing the usage of fls(). > > Fixes: 7e73a6e7f12a ('have architectures specify the number of PIRQs a hardware domain gets') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Yeah, that fls() fails the dimensional analysis sniff test. Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Is there any hint as to what the reasoning was? ~Andrew
© 2016 - 2024 Red Hat, Inc.