[PATCH] ppc/spapr: Initialize max_cpus limit to an allowed usable limit.

Harsh Prateek Bora posted 1 patch 1 year ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20231122074802.868083-1-harshpb@linux.ibm.com
Maintainers: Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>, "Cédric Le Goater" <clg@kaod.org>, David Gibson <david@gibson.dropbear.id.au>, Harsh Prateek Bora <harshpb@linux.ibm.com>
[PATCH] ppc/spapr: Initialize max_cpus limit to an allowed usable limit.
Posted by Harsh Prateek Bora 1 year ago
Initialize the machine-specific max_cpus limit to a usable value of 4096.
Values above 4096 and up to 8192 fail with an "IRQ not free" error due to
a XIVE limitation, and values beyond 8192 hit an assert in tcg_region_init
or spapr_xive_claim_irq.

Logs:

Without patch fix:

[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: IRQ 4096 is not free
[root@host build]#

On LPAR:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
**
ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed:
(region_size >= 2 * page_size)
Aborted (core dumped)
[root@host build]#

On x86:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193
qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq:
Assertion `lisn < xive->nr_irqs' failed.
Aborted (core dumped)
[root@host build]#

With patch fix:
[root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097
qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by
machine 'pseries-8.2' is 4096
[root@host build]#

Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr.c         | 9 +++------
 include/hw/ppc/spapr.h | 1 +
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index df09aa9d6a..1995949ea5 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4647,13 +4647,10 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->block_default_type = IF_SCSI;
 
     /*
-     * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values
-     * should be limited by the host capability instead of hardcoded.
-     * max_cpus for KVM guests will be checked in kvm_init(), and TCG
-     * guests are welcome to have as many CPUs as the host are capable
-     * of emulate.
+     * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(),
+     * In TCG the limit is restricted by max-irqs setup by XIVE which is 4096.
      */
-    mc->max_cpus = INT32_MAX;
+    mc->max_cpus = SPAPR_MAX_CPUS;
 
     mc->no_parallel = 1;
     mc->default_boot_order = "";
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e91791a1a9..210849a494 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -23,6 +23,7 @@ typedef struct SpaprPendingHpt SpaprPendingHpt;
 
 typedef struct Vof Vof;
 
+#define SPAPR_MAX_CPUS          4096
 #define HPTE64_V_HPTE_DIRTY     0x0000000000000040ULL
 #define SPAPR_ENTRY_POINT       0x100
 
-- 
2.39.3
Re: [PATCH] ppc/spapr: Initialize max_cpus limit to an allowed usable limit.
Posted by Cédric Le Goater 1 year ago
On 11/22/23 08:48, Harsh Prateek Bora wrote:
> Initialize the machine specific max_cpus limit to a usable limit 4096.
> Keeping between 4096 to 8192 will throw IRQ not free error due to XIVE
> limitation and keeping beyond 8192 will hit assert in tcg_region_init
> or spapr_xive_claim_irq.

The IRQ number space is defined in include/hw/ppc/spapr_irq.h. XICS and
XIVE share the same IRQ number space, so this is not a XIVE limitation. It
is how we organized interrupt numbers in the pseries-3.1 machine.

SPAPR_XIRQ_BASE defines the offset at which the device IRQ numbers
start; the range of IRQ numbers below that offset is reserved for
IPIs. The code assumes that both ranges, IPIs and devices, are
contiguous, and it takes a small shortcut by using the
SPAPR_XIRQ_BASE define as the size of the IPI range.

hw/ppc/spapr_irq.c:        qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
hw/ppc/spapr_irq.c:                                      smc->nr_xirqs + SPAPR_XIRQ_BASE);

This should use a SPAPR_NR_IPIS define (like we have a SPAPR_NR_XIRQS
define) instead, which could be used to define mc->max_cpus like we
define smc->nr_xirqs.
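
To make the suggestion concrete, here is a minimal standalone sketch of the layout described above. It is not the actual QEMU code: the numeric values are assumptions chosen to match the "IRQ 4096 is not free" log (check include/hw/ppc/spapr_irq.h for the real ones), SPAPR_NR_IPIS is the proposed define that does not exist yet, and the helper functions are hypothetical names used only for illustration. It also assumes one IPI is claimed per possible vCPU, numbered upward from SPAPR_IRQ_IPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; see include/hw/ppc/spapr_irq.h. */
#define SPAPR_IRQ_IPI     0x0      /* first IPI number */
#define SPAPR_XIRQ_BASE   0x1000   /* first device IRQ number */
#define SPAPR_NR_XIRQS    0x1000   /* number of device IRQs */

/* Proposed (hypothetical) define: size of the IPI range below the base. */
#define SPAPR_NR_IPIS     (SPAPR_XIRQ_BASE - SPAPR_IRQ_IPI)

/* Total IRQ count passed to the interrupt controller as "nr-irqs". */
static uint32_t spapr_total_irqs(uint32_t nr_xirqs)
{
    return SPAPR_NR_IPIS + nr_xirqs;
}

/* True if the IPI for this vCPU stays below the device IRQ range. */
static bool spapr_ipi_in_range(uint32_t vcpu_index)
{
    return SPAPR_IRQ_IPI + vcpu_index < SPAPR_XIRQ_BASE;
}

/* Value that mc->max_cpus could be set to under this layout. */
static uint32_t spapr_max_cpus(void)
{
    return SPAPR_NR_IPIS;
}
```

With these assumed values, vCPU index 4096 would need IRQ 4096, which is already the first device IRQ, so the CPU limit falls out of the IPI range size rather than from anything specific to XIVE.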

Thanks,

C.

  
Re: [PATCH] ppc/spapr: Initialize max_cpus limit to an allowed usable limit.
Posted by Harsh Prateek Bora 1 year ago

On 11/22/23 14:06, Cédric Le Goater wrote:
> On 11/22/23 08:48, Harsh Prateek Bora wrote:
>> Initialize the machine specific max_cpus limit to a usable limit 4096.
>> Keeping between 4096 to 8192 will throw IRQ not free error due to XIVE
>> limitation and keeping beyond 8192 will hit assert in tcg_region_init
>> or spapr_xive_claim_irq.
> 
> The IRQ number space is defined in include/hw/ppc/spapr_irq.h. XICS and
> XIVE have the same IRQ number space, it is not a XIVE limitation. It
> is how we organized interrupt numbers in the pseries-3.1 machine.
> 
> SPAPR_XIRQ_BASE defines an offset, at which the device IRQ numbers
> start, and below that offset, the range of IRQ numbers is reserved
> for IPIs. An assumption is made on the fact the both ranges, IPIs and
> devices, are contiguous and there is a little shortcut being done with
> the SPAPR_XIRQ_BASE define.
> 
> hw/ppc/spapr_irq.c:        qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE);
> hw/ppc/spapr_irq.c:                                      smc->nr_xirqs + SPAPR_XIRQ_BASE);
> 
> This should use a SPAPR_NR_IPIS define (like we have a SPAPR_NR_XIRQS
> define) instead, which could be used to define mc->max_cpus like we
> define smc->nr_xirqs.
> 

Thanks, Cédric, for your review comments.
I have posted a v2 incorporating your suggestion.

regards,
Harsh