[PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip

Ani Sinha posted 1 patch 1 month, 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20260413090010.60339-1-anisinha@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>
There is a newer version of this series
accel/kvm/kvm-all.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
[PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip
Posted by Ani Sinha 1 month, 2 weeks ago
During refactoring of kvm_irqchip_create(), the refactored code was returning
early from do_kvm_irqchip_create() function if the required capabilities were
not present in KVM. This was not translating to an early return from
kvm_irqchip_create() as was the case before refactoring. This is because,
do_kvm_irqchip_create() did not have a means to notify the caller of the lack
of kvm capabilities. Fix this by making do_notify_irqchip_create() return
EOPNOTSUPP error when required capabilities are absent and then the caller
can check the return code and return early.

Fixes: 98884e0cc1 ("accel/kvm: add changes required to support KVM VM file descriptor change")
Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
Reported-by: Gautam Menghani <gautam@linux.ibm.com>
Suggested-by: Fabiano Rosas <farosas@suse.de>
Suggested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
---
 accel/kvm/kvm-all.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 774499d34f..92af42503b 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2575,7 +2575,7 @@ void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi)
     g_hash_table_insert(s->gsimap, irq, GINT_TO_POINTER(gsi));
 }
 
-static void do_kvm_irqchip_create(KVMState *s)
+static int do_kvm_irqchip_create(KVMState *s)
 {
     int ret;
     if (kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
@@ -2587,7 +2587,7 @@ static void do_kvm_irqchip_create(KVMState *s)
             exit(1);
         }
     } else {
-        return;
+        return -EOPNOTSUPP;
     }
 
     if (kvm_check_extension(s, KVM_CAP_IRQFD) <= 0) {
@@ -2610,13 +2610,17 @@ static void do_kvm_irqchip_create(KVMState *s)
         fprintf(stderr, "Create kernel irqchip failed: %s\n", strerror(-ret));
         exit(1);
     }
+
+    return 0;
 }
 
 static void kvm_irqchip_create(KVMState *s)
 {
     assert(s->kernel_irqchip_split != ON_OFF_AUTO_AUTO);
 
-    do_kvm_irqchip_create(s);
+    if (do_kvm_irqchip_create(s) < 0) {
+        return;
+    }
     kvm_kernel_irqchip = true;
     /* If we have an in-kernel IRQ chip then we must have asynchronous
      * interrupt delivery (though the reverse is not necessarily true)
@@ -2835,6 +2839,7 @@ static int kvm_reset_vmfd(MachineState *ms)
     }
 
     if (s->kernel_irqchip_allowed) {
+        /* ignore return from this function */
         do_kvm_irqchip_create(s);
     }
 
-- 
2.49.0
Re: [PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip
Posted by Misbah Anjum N 1 month, 2 weeks ago
Hi Ani,
I've tested the v1 patch: [PATCH for-11.0] accel/kvm: return early from 
kvm_irqchip_create if kvm does not support irqchip, on PowerPC machine 
and it resolves the boot hang issue seen when booting KVM guest with >1 
smp value.

Regression Details:
Commit 98884e0cc1 ("accel/kvm: add changes required to support KVM VM 
file descriptor change") led to a regression which caused KVM guests on 
ppc64le to hang indefinitely during boot when SMP was configured. Single 
vCPU guests booted successfully, but any configuration with more than 
one vCPU would hang.
The hang occurred at bql_lock() in qemu_default_main() 
(system/main.c:49), where the system would become completely 
unresponsive after successfully completing qemu_system_reset().
Reference: 
https://lists.nongnu.org/archive/html/qemu-ppc/2026-03/msg00035.html

Test Environment:
- Host Arch: ppc64le
- Host and Guest OS: Fedora 42
- Machine Type: pseries with KVM acceleration
- QEMU: Latest master with this patch applied

Test Results:
All the following SMP topologies now boot successfully:

Single and simple multi-CPU:
- -smp 1
- -smp 2
- -smp 4
- -smp 32

Various socket/core/thread combinations (8 vCPUs):
- -smp 8,sockets=8,cores=1,threads=1
- -smp 8,sockets=1,cores=8,threads=1
- -smp 8,sockets=1,cores=1,threads=8
- -smp 8,sockets=2,cores=4,threads=1
- -smp 8,sockets=1,cores=4,threads=2
- -smp 8,sockets=2,cores=1,threads=4
- -smp 8,sockets=2,cores=2,threads=2

Higher vCPU count:
- -smp 16,sockets=2,cores=4,threads=2
- -smp 32,sockets=1,cores=8,threads=4

Tested-by: Misbah Anjum N <misanjum@linux.ibm.com>

Thanks,
Misbah Anjum N <misanjum@linux.ibm.com>

On 2026-04-13 14:30, Ani Sinha wrote:
> During refactoring of kvm_irqchip_create(), the refactored code was 
> returning
> early from do_kvm_irqchip_create() function if the required 
> capabilities were
> not present in KVM. This was not translating to an early return from
> kvm_irqchip_create() as was the case before refactoring. This is 
> because,
> do_kvm_irqchip_create() did not have a means to notify the caller of 
> the lack
> of kvm capabilities. Fix this by making do_notify_irqchip_create() 
> return
> EOPNOTSUPP error when required capabilities are absent and then the 
> caller
> can check the return code and return early.
> 
> Fixes: 98884e0cc1 ("accel/kvm: add changes required to support KVM VM
> file descriptor change")
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Reported-by: Gautam Menghani <gautam@linux.ibm.com>
> Suggested-by: Fabiano Rosas <farosas@suse.de>
> Suggested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>  accel/kvm/kvm-all.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 774499d34f..92af42503b 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2575,7 +2575,7 @@ void kvm_irqchip_set_qemuirq_gsi(KVMState *s,
> qemu_irq irq, int gsi)
>      g_hash_table_insert(s->gsimap, irq, GINT_TO_POINTER(gsi));
>  }
> 
> -static void do_kvm_irqchip_create(KVMState *s)
> +static int do_kvm_irqchip_create(KVMState *s)
>  {
>      int ret;
>      if (kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
> @@ -2587,7 +2587,7 @@ static void do_kvm_irqchip_create(KVMState *s)
>              exit(1);
>          }
>      } else {
> -        return;
> +        return -EOPNOTSUPP;
>      }
> 
>      if (kvm_check_extension(s, KVM_CAP_IRQFD) <= 0) {
> @@ -2610,13 +2610,17 @@ static void do_kvm_irqchip_create(KVMState *s)
>          fprintf(stderr, "Create kernel irqchip failed: %s\n", 
> strerror(-ret));
>          exit(1);
>      }
> +
> +    return 0;
>  }
> 
>  static void kvm_irqchip_create(KVMState *s)
>  {
>      assert(s->kernel_irqchip_split != ON_OFF_AUTO_AUTO);
> 
> -    do_kvm_irqchip_create(s);
> +    if (do_kvm_irqchip_create(s) < 0) {
> +        return;
> +    }
>      kvm_kernel_irqchip = true;
>      /* If we have an in-kernel IRQ chip then we must have asynchronous
>       * interrupt delivery (though the reverse is not necessarily true)
> @@ -2835,6 +2839,7 @@ static int kvm_reset_vmfd(MachineState *ms)
>      }
> 
>      if (s->kernel_irqchip_allowed) {
> +        /* ignore return from this function */
>          do_kvm_irqchip_create(s);
>      }
Re: [PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip
Posted by Peter Maydell 1 month, 2 weeks ago
On Mon, 13 Apr 2026 at 10:01, Ani Sinha <anisinha@redhat.com> wrote:
>
> During refactoring of kvm_irqchip_create(), the refactored code was returning
> early from do_kvm_irqchip_create() function if the required capabilities were
> not present in KVM. This was not translating to an early return from
> kvm_irqchip_create() as was the case before refactoring. This is because,
> do_kvm_irqchip_create() did not have a means to notify the caller of the lack
> of kvm capabilities. Fix this by making do_notify_irqchip_create() return
> EOPNOTSUPP error when required capabilities are absent and then the caller
> can check the return code and return early.

As a fix intended for 11.0 at this point in the release cycle,
it would be helpful if the commit message stated what the
user-visible consqeuences of this bug are. That helps in
identifying whether this really does need to go into 11.0
(i.e. whether it is release-critical).

> Fixes: 98884e0cc1 ("accel/kvm: add changes required to support KVM VM file descriptor change")
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Reported-by: Gautam Menghani <gautam@linux.ibm.com>
> Suggested-by: Fabiano Rosas <farosas@suse.de>
> Suggested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

as the minimal fix for the logic-bug introduced by 98884e0cc1.

thanks
-- PMM
Re: [PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip
Posted by Ani Sinha 1 month, 2 weeks ago

> On 13 Apr 2026, at 5:40 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On Mon, 13 Apr 2026 at 10:01, Ani Sinha <anisinha@redhat.com> wrote:
>> 
>> During refactoring of kvm_irqchip_create(), the refactored code was returning
>> early from do_kvm_irqchip_create() function if the required capabilities were
>> not present in KVM. This was not translating to an early return from
>> kvm_irqchip_create() as was the case before refactoring. This is because,
>> do_kvm_irqchip_create() did not have a means to notify the caller of the lack
>> of kvm capabilities. Fix this by making do_notify_irqchip_create() return
>> EOPNOTSUPP error when required capabilities are absent and then the caller
>> can check the return code and return early.
> 
> As a fix intended for 11.0 at this point in the release cycle,
> it would be helpful if the commit message stated what the
> user-visible consqeuences of this bug are.

Ok I would appreciate if you can add this line in the commit log before putting it in CI:

“Due to this regression, all KVM guests on PPC 64 hang immediately during startup"

Or if you prefer, I can spin up a new version with the updated commit message.

> That helps in
> identifying whether this really does need to go into 11.0
> (i.e. whether it is release-critical).
> 
>> Fixes: 98884e0cc1 ("accel/kvm: add changes required to support KVM VM file descriptor change")
>> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
>> Reported-by: Gautam Menghani <gautam@linux.ibm.com>
>> Suggested-by: Fabiano Rosas <farosas@suse.de>
>> Suggested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> as the minimal fix for the logic-bug introduced by 98884e0cc1.
> 
> thanks
> -- PMM
> 
Re: [PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip
Posted by Peter Maydell 1 month, 2 weeks ago
On Mon, 13 Apr 2026 at 10:01, Ani Sinha <anisinha@redhat.com> wrote:
>
> During refactoring of kvm_irqchip_create(), the refactored code was returning
> early from do_kvm_irqchip_create() function if the required capabilities were
> not present in KVM. This was not translating to an early return from
> kvm_irqchip_create() as was the case before refactoring. This is because,
> do_kvm_irqchip_create() did not have a means to notify the caller of the lack
> of kvm capabilities. Fix this by making do_notify_irqchip_create() return
> EOPNOTSUPP error when required capabilities are absent and then the caller
> can check the return code and return early.
>
> Fixes: 98884e0cc1 ("accel/kvm: add changes required to support KVM VM file descriptor change")
> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
> Reported-by: Gautam Menghani <gautam@linux.ibm.com>
> Suggested-by: Fabiano Rosas <farosas@suse.de>
> Suggested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>  accel/kvm/kvm-all.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 774499d34f..92af42503b 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2575,7 +2575,7 @@ void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi)
>      g_hash_table_insert(s->gsimap, irq, GINT_TO_POINTER(gsi));
>  }
>
> -static void do_kvm_irqchip_create(KVMState *s)
> +static int do_kvm_irqchip_create(KVMState *s)
>  {
>      int ret;
>      if (kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
> @@ -2587,7 +2587,7 @@ static void do_kvm_irqchip_create(KVMState *s)
>              exit(1);
>          }
>      } else {
> -        return;
> +        return -EOPNOTSUPP;
>      }
>
>      if (kvm_check_extension(s, KVM_CAP_IRQFD) <= 0) {

This function confuses me. With this patch it can do three things:
 * report an error and exit(1)
 * return success
 * return a failure code

What is the difference between the cases where we exit and
where we return the error code? The commit message says we
return an error "when required capabilities are absent", but
we do the error-exit for e.g. "KVM_CAP_IRQFD not present", which
sounds like "required capabilities are absent" to me...

thanks
-- PMM
Re: [PATCH for-11.0] accel/kvm: return early from kvm_irqchip_create if kvm does not support irqchip
Posted by Ani Sinha 1 month, 2 weeks ago

> On 13 Apr 2026, at 3:03 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On Mon, 13 Apr 2026 at 10:01, Ani Sinha <anisinha@redhat.com> wrote:
>> 
>> During refactoring of kvm_irqchip_create(), the refactored code was returning
>> early from do_kvm_irqchip_create() function if the required capabilities were
>> not present in KVM. This was not translating to an early return from
>> kvm_irqchip_create() as was the case before refactoring. This is because,
>> do_kvm_irqchip_create() did not have a means to notify the caller of the lack
>> of kvm capabilities. Fix this by making do_notify_irqchip_create() return
>> EOPNOTSUPP error when required capabilities are absent and then the caller
>> can check the return code and return early.
>> 
>> Fixes: 98884e0cc1 ("accel/kvm: add changes required to support KVM VM file descriptor change")
>> Reported-by: Misbah Anjum N <misanjum@linux.ibm.com>
>> Reported-by: Gautam Menghani <gautam@linux.ibm.com>
>> Suggested-by: Fabiano Rosas <farosas@suse.de>
>> Suggested-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>> ---
>> accel/kvm/kvm-all.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>> 
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index 774499d34f..92af42503b 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -2575,7 +2575,7 @@ void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi)
>>     g_hash_table_insert(s->gsimap, irq, GINT_TO_POINTER(gsi));
>> }
>> 
>> -static void do_kvm_irqchip_create(KVMState *s)
>> +static int do_kvm_irqchip_create(KVMState *s)
>> {
>>     int ret;
>>     if (kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
>> @@ -2587,7 +2587,7 @@ static void do_kvm_irqchip_create(KVMState *s)
>>             exit(1);
>>         }
>>     } else {
>> -        return;
>> +        return -EOPNOTSUPP;
>>     }
>> 
>>     if (kvm_check_extension(s, KVM_CAP_IRQFD) <= 0) {
> 
> This function confuses me. With this patch it can do three things:
> * report an error and exit(1)
> * return success
> * return a failure code
> 
> What is the difference between the cases where we exit and
> where we return the error code? The commit message says we
> return an error "when required capabilities are absent", but
> we do the error-exit for e.g. "KVM_CAP_IRQFD not present", which
> sounds like "required capabilities are absent" to me...

I think it should be an assertion there. That is instead of exit(1) we should have had

assert(kvm_check_extension(s, KVM_CAP_IRQFD));

That is you can’t have KVM_CAP_IRQCHIP cap and not have KVM_CAP_IRQFD . That is how I understand it.