Consider a system with 8 harts, where each hart supports 5
Guest Interrupt Files (GIFs), yielding 40 total GIFs.
If we launch a QEMU guest with over 5 vCPUs using
"-M virt,aia='aplic-imsic' -accel kvm,riscv-aia=hwaccel" – which
relies solely on VS-files (not SW-files) for higher performance – the
guest requires more than 5 GIFs. However, the current Linux scheduler
lacks GIF awareness, potentially scheduling >5 vCPUs to a single hart.
This triggers VS-file allocation failure, and since no handler exists
for this error, the QEMU guest becomes corrupted.

To address this, we introduce KVM_EXIT_FAIL_ENTRY_NO_VSFILE upon
VS-file allocation failure. This provides an opportunity for graceful
error handling instead of corruption. For example, QEMU can handle
this exit by rescheduling vCPUs to alternative harts when VS-file
allocation fails on the current hart [1].

[1] https://github.com/BillXiang/qemu/tree/riscv-vsfile-alloc/

Signed-off-by: BillXiang <xiangwencheng@lanxincomputing.com>
---
arch/riscv/include/uapi/asm/kvm.h | 2 ++
arch/riscv/kvm/aia_imsic.c | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index 5f59fd226cc5..be29c3502fe4 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -22,6 +22,8 @@
 #define KVM_INTERRUPT_SET	-1U
 #define KVM_INTERRUPT_UNSET	-2U
 
+#define KVM_EXIT_FAIL_ENTRY_NO_VSFILE	(1ULL << 0)
+
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
 };
diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c
index 29ef9c2133a9..69b0ab651389 100644
--- a/arch/riscv/kvm/aia_imsic.c
+++ b/arch/riscv/kvm/aia_imsic.c
@@ -760,7 +760,7 @@ int kvm_riscv_vcpu_aia_imsic_update(struct kvm_vcpu *vcpu)
 		/* For HW acceleration mode, we can't continue */
 		if (kvm->arch.aia.mode == KVM_DEV_RISCV_AIA_MODE_HWACCEL) {
 			run->fail_entry.hardware_entry_failure_reason =
-								CSR_HSTATUS;
+				KVM_EXIT_FAIL_ENTRY_NO_VSFILE;
 			run->fail_entry.cpu = vcpu->cpu;
 			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 			return 0;
--
2.46.2.windows.1
On Mon, Jul 14, 2025 at 12:51 PM BillXiang <xiangwencheng@lanxincomputing.com> wrote:
>
> Consider a system with 8 harts, where each hart supports 5
> Guest Interrupt Files (GIFs), yielding 40 total GIFs.
> If we launch a QEMU guest with over 5 vCPUs using
> "-M virt,aia='aplic-imsic' -accel kvm,riscv-aia=hwaccel" – which
> relies solely on VS-files (not SW-files) for higher performance – the
> guest requires more than 5 GIFs. However, the current Linux scheduler
> lacks GIF awareness, potentially scheduling >5 vCPUs to a single hart.
> This triggers VS-file allocation failure, and since no handler exists
> for this error, the QEMU guest becomes corrupted.
>
> To address this, we introduce KVM_EXIT_FAIL_ENTRY_NO_VSFILE upon
> VS-file allocation failure. This provides an opportunity for graceful
> error handling instead of corruption. For example, QEMU can handle
> this exit by rescheduling vCPUs to alternative harts when VS-file
> allocation fails on the current hart [1].

Currently, we return CSR_HSTATUS as hardware_entry_failure_reason which
is vague so it is better to return a well defined value provided via
uapi/asm/kvm.h.

In general, this patch is fine but the commit description needs to be
improved along these lines.

Regards,
Anup

>
> [1] https://github.com/BillXiang/qemu/tree/riscv-vsfile-alloc/
>
> Signed-off-by: BillXiang <xiangwencheng@lanxincomputing.com>
> ---
>  arch/riscv/include/uapi/asm/kvm.h | 2 ++
>  arch/riscv/kvm/aia_imsic.c        | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
> index 5f59fd226cc5..be29c3502fe4 100644
> --- a/arch/riscv/include/uapi/asm/kvm.h
> +++ b/arch/riscv/include/uapi/asm/kvm.h
> @@ -22,6 +22,8 @@
>  #define KVM_INTERRUPT_SET	-1U
>  #define KVM_INTERRUPT_UNSET	-2U
>
> +#define KVM_EXIT_FAIL_ENTRY_NO_VSFILE	(1ULL << 0)
> +
>  /* for KVM_GET_REGS and KVM_SET_REGS */
>  struct kvm_regs {
>  };
> diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c
> index 29ef9c2133a9..69b0ab651389 100644
> --- a/arch/riscv/kvm/aia_imsic.c
> +++ b/arch/riscv/kvm/aia_imsic.c
> @@ -760,7 +760,7 @@ int kvm_riscv_vcpu_aia_imsic_update(struct kvm_vcpu *vcpu)
>  		/* For HW acceleration mode, we can't continue */
>  		if (kvm->arch.aia.mode == KVM_DEV_RISCV_AIA_MODE_HWACCEL) {
>  			run->fail_entry.hardware_entry_failure_reason =
> -								CSR_HSTATUS;
> +				KVM_EXIT_FAIL_ENTRY_NO_VSFILE;
>  			run->fail_entry.cpu = vcpu->cpu;
>  			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>  			return 0;
> --
> 2.46.2.windows.1
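
For illustration only, here is a minimal sketch of how a VMM might react to
the new failure reason. It is not taken from the QEMU branch referenced in
[1]. The struct exit_info type, the free_gifs[] bookkeeping and both helper
functions are hypothetical names introduced for this example. The struct
only mirrors the three kvm_run fields that the hunk in aia_imsic.c fills in.
The value of KVM_EXIT_FAIL_ENTRY is copied from <linux/kvm.h> so the sketch
builds without kernel headers. Testing the reason with a bitwise AND assumes
the field is meant to carry flags, which the (1ULL << 0) definition suggests
but the patch does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value introduced by the patch in arch/riscv/include/uapi/asm/kvm.h. */
#define KVM_EXIT_FAIL_ENTRY_NO_VSFILE	(1ULL << 0)
/* Generic exit code from <linux/kvm.h>, repeated here for portability. */
#define KVM_EXIT_FAIL_ENTRY		9

/* Hypothetical stand-in for the kvm_run fields set by the kernel hunk. */
struct exit_info {
	uint32_t exit_reason;
	uint64_t hardware_entry_failure_reason;
	uint32_t cpu;	/* hart on which VS-file allocation failed */
};

/*
 * Return the first hart that still has a free guest interrupt file,
 * skipping the hart that just failed, or -1 if there is none.
 * free_gifs[] is bookkeeping the VMM would have to maintain itself
 * (assumption: the kernel does not export it in this patch).
 */
static int pick_alternative_hart(const unsigned int *free_gifs,
				 size_t nharts, uint32_t failed_hart)
{
	assert(free_gifs != NULL);

	for (size_t i = 0; i < nharts; i++) {
		if (i != failed_hart && free_gifs[i] > 0)
			return (int)i;
	}
	return -1;
}

/*
 * Decide what to do after KVM_RUN returns.
 * Returns a hart index to move the vCPU thread to, -1 if no hart has a
 * free VS-file, or -2 if the exit is not a VS-file allocation failure.
 */
static int handle_exit(const struct exit_info *e,
		       const unsigned int *free_gifs, size_t nharts)
{
	if (e->exit_reason != KVM_EXIT_FAIL_ENTRY)
		return -2;
	if (!(e->hardware_entry_failure_reason & KVM_EXIT_FAIL_ENTRY_NO_VSFILE))
		return -2;
	return pick_alternative_hart(free_gifs, nharts, e->cpu);
}
```

With the 8-hart, 5-GIF example from the commit message, a VMM that tracks
free_gifs[] = {0, 5, 5, 5, 5, 5, 5, 5} and sees the exit reported for hart 0
would get hart 1 back from handle_exit() and could then re-pin the vCPU
thread there before retrying KVM_RUN. How the thread is actually moved
(for example via CPU affinity) is left out of the sketch.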