arch/riscv/include/asm/kvm_vcpu_sbi.h | 23 +- arch/riscv/include/uapi/asm/kvm.h | 14 ++ arch/riscv/kvm/vcpu.c | 3 +- arch/riscv/kvm/vcpu_onereg.c | 60 +----- arch/riscv/kvm/vcpu_sbi.c | 172 ++++++++++++--- arch/riscv/kvm/vcpu_sbi_fwft.c | 199 ++++++++++++++++-- arch/riscv/kvm/vcpu_sbi_sta.c | 64 ++++-- .../selftests/kvm/riscv/get-reg-list.c | 28 +++ 8 files changed, 436 insertions(+), 127 deletions(-)
This series adds ONE_REG interface for SBI FWFT extension implemented by KVM RISC-V. This was missed out in accepted SBI FWFT patches for KVM RISC-V. These patches can also be found in the riscv_kvm_fwft_one_reg_v1 branch at: https://github.com/avpatel/linux.git Anup Patel (6): RISC-V: KVM: Set initial value of hedeleg in kvm_arch_vcpu_create() RISC-V: KVM: Introduce feature specific reset for SBI FWFT RISC-V: KVM: Introduce optional ONE_REG callbacks for SBI extensions RISC-V: KVM: Move copy_sbi_ext_reg_indices() to SBI implementation RISC-V: KVM: Implement ONE_REG interface for SBI FWFT state KVM: riscv: selftests: Add SBI FWFT to get-reg-list test arch/riscv/include/asm/kvm_vcpu_sbi.h | 23 +- arch/riscv/include/uapi/asm/kvm.h | 14 ++ arch/riscv/kvm/vcpu.c | 3 +- arch/riscv/kvm/vcpu_onereg.c | 60 +----- arch/riscv/kvm/vcpu_sbi.c | 172 ++++++++++++--- arch/riscv/kvm/vcpu_sbi_fwft.c | 199 ++++++++++++++++-- arch/riscv/kvm/vcpu_sbi_sta.c | 64 ++++-- .../selftests/kvm/riscv/get-reg-list.c | 28 +++ 8 files changed, 436 insertions(+), 127 deletions(-) -- 2.43.0
2025-08-14T21:25:42+05:30, Anup Patel <apatel@ventanamicro.com>: > This series adds ONE_REG interface for SBI FWFT extension implemented > by KVM RISC-V. I think it would be better to ONE_REG the CSRs (medeleg/menvcfg), or at least expose their CSR fields (each sensible medeleg bit, PMM, ...) through kvm_riscv_config, than to couple this with SBI/FWFT. The controlled behavior is defined by the ISA, and userspace might want to configure the S-mode execution environment even when SBI/FWFT is not present, which is not possible with the current design. Is there a benefit in expressing the ISA model through SBI/FWFT? Thanks.
On Mon, Aug 18, 2025 at 3:59 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: > > 2025-08-14T21:25:42+05:30, Anup Patel <apatel@ventanamicro.com>: > > This series adds ONE_REG interface for SBI FWFT extension implemented > > by KVM RISC-V. > > I think it would be better to ONE_REG the CSRs (medeleg/menvcfg), or at > least expose their CSR fields (each sensible medeleg bit, PMM, ...) > through kvm_riscv_config, than to couple this with SBI/FWFT. > > The controlled behavior is defined by the ISA, and userspace might want > to configure the S-mode execution environment even when SBI/FWFT is not > present, which is not possible with the current design. > > Is there a benefit in expressing the ISA model through SBI/FWFT? > Exposing medeleg/menvcfg is not the right approach because a Guest/VM does not have M-mode hence it is not appropriate to expose m<xyz> CSRs via ONE_REG interface. This also aligns with H-extension architecture which does not virtualize M-mode. We already had discussions about this in the past. As such, we have two options. One option is to expose hedeleg/henvcfg via kvm_riscv_config and another option is to have a separate ONE_REG for each FWFT feature. Separate ONE_REG registers for each FWFT feature is better than directly exposing hedeleg/henvcfg via ONE_REG because: 1) Once nested virtualization lands, we will be having separate hedeleg/henvcfg as part of nested virtualization state of Guest which is trap-n-emulated by KVM. The existence of hedeleg/henvcfg in kvm_riscv_config and nested virtualization state will only create more confusion. 2) Not all bits in hedeleg/henvcfg are used for FWFT since quite a few bits are programmed with fixed value based on KVM implementation choices (which may change in future). Also, things like set_debug_ioctl() change hedeleg at runtime which allow KVM user space to decide who takes breakpoint traps from Guest/VM. This means value saved/restored through hedeleg/henvcfg in kvm_riscv_config becomes specific to the kernel version and specific to host ISA features. 3) We anyway need to provide ONE_REG interface to save/restore FWFT feature flags so it's better to keep the FWFT feature value as part of the same ONE_REG interface. 4) The availability of quite a few FWFT features is dependent on corresponding ISA extensions so having separate ONE_REG registers of each FWFT feature allows get_reg_list_ioctl() to provide KVM user-space only available FWFT feature registers. Regards, Anup
2025-08-19T12:00:43+05:30, Anup Patel <apatel@ventanamicro.com>: > On Mon, Aug 18, 2025 at 3:59 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: >> >> 2025-08-14T21:25:42+05:30, Anup Patel <apatel@ventanamicro.com>: >> > This series adds ONE_REG interface for SBI FWFT extension implemented >> > by KVM RISC-V. >> >> I think it would be better to ONE_REG the CSRs (medeleg/menvcfg), or at >> least expose their CSR fields (each sensible medeleg bit, PMM, ...) >> through kvm_riscv_config, than to couple this with SBI/FWFT. >> >> The controlled behavior is defined by the ISA, and userspace might want >> to configure the S-mode execution environment even when SBI/FWFT is not >> present, which is not possible with the current design. >> >> Is there a benefit in expressing the ISA model through SBI/FWFT? >> > > Exposing medeleg/menvcfg is not the right approach because a > Guest/VM does not have M-mode hence it is not appropriate to > expose m<xyz> CSRs via ONE_REG interface. This also aligns > with H-extension architecture which does not virtualize M-mode. We already have mvendorid, marchid, and mipid in kvm_riscv_config. The virtualized M-mode is userspace+KVM. (KVM doesn't allow userspace to configure most things now, but I think we'll have to change that when getting ready for production.) > We already had discussions about this in the past. > > As such, we have two options. One option is to expose > hedeleg/henvcfg via kvm_riscv_config and another option > is to have a separate ONE_REG for each FWFT feature. > > Separate ONE_REG registers for each FWFT feature is better > than directly exposing hedeleg/henvcfg via ONE_REG because: > > 1) Once nested virtualization lands, we will be having separate > hedeleg/henvcfg as part of nested virtualization state of Guest > which is trap-n-emulated by KVM. The existence of hedeleg/henvcfg > in kvm_riscv_config and nested virtualization state will only create > more confusion. Right, the userspace registers for this can't be called h*. > 2) Not all bits in hedeleg/henvcfg are used for FWFT since quite > a few bits are programmed with fixed value based on KVM > implementation choices (which may change in future). Yes, we'll want to expose some to userspace. > Also, > things like set_debug_ioctl() change hedeleg at runtime > which allow KVM user space to decide who takes breakpoint > traps from Guest/VM. This is still doable. The clear hedeleg bit does not have to change the virtualized behavior -- if the guest is expecting to see breakpoint traps, then even if userspace+KVM configure the architecture to direct the traps to the hypervisor, they must then forward the breakpoints that were supposed to be delivered to the guest. > This means value saved/restored > through hedeleg/henvcfg in kvm_riscv_config becomes > specific to the kernel version and specific to host ISA features. Hedeleg/henvcfg bits do not have to be the same as userspace interface bits -- KVM always has to distinguish what the userspace wants to virtualize, and what the KVM changed for its own reasons. > 3) We anyway need to provide ONE_REG interface to > save/restore FWFT feature flags so it's better to keep the > FWFT feature value as part of the same ONE_REG interface. I think we want to have SBI in userspace (especially for single-shot ecalls like FWFT). The userspace implementation will want an interface to set the ISA bits, and it's very awkward with the proposed design. Flags can to stay, in case the userpace wants to accelerate FWFT. > 4) The availability of quite a few FWFT features is dependent > on corresponding ISA extensions so having separate ONE_REG > registers of each FWFT feature allows get_reg_list_ioctl() to > provide KVM user-space only available FWFT feature registers. Yes, but similarly the userspace would be forbidden from setting bits that cannot be expressed in henvcfg/hededeg. There are also behaviors we want to configure that do not have a FWFT toggle. e.g. the recent patches for delegation of illegal-instruction exceptions that changed the guest behavior -- someone might want to keep incrementing the SBI PMU counter, and someone will want to forward them to be implemented in userspace (when developing a new extension, because most of the existing ISA can still be accelerated by KVM). For general virtualization, we want to be able to configure the following behavior for each exception that would go to the virtualized M-mode: 0) delegated to the guest 1) implemented by userspace 2-N) implementations by KVM (ideally zero or one) We can have medeleg, and another method to decide how to handle trapped exceptions, but it probably makes more sense to have a per-exception ONE_REG that sets how each exception behaves. The FWFT value could be a part of a more general interface. Thanks.
On Tue, Aug 19, 2025 at 5:13 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: > > 2025-08-19T12:00:43+05:30, Anup Patel <apatel@ventanamicro.com>: > > On Mon, Aug 18, 2025 at 3:59 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: > >> > >> 2025-08-14T21:25:42+05:30, Anup Patel <apatel@ventanamicro.com>: > >> > This series adds ONE_REG interface for SBI FWFT extension implemented > >> > by KVM RISC-V. > >> > >> I think it would be better to ONE_REG the CSRs (medeleg/menvcfg), or at > >> least expose their CSR fields (each sensible medeleg bit, PMM, ...) > >> through kvm_riscv_config, than to couple this with SBI/FWFT. > >> > >> The controlled behavior is defined by the ISA, and userspace might want > >> to configure the S-mode execution environment even when SBI/FWFT is not > >> present, which is not possible with the current design. > >> > >> Is there a benefit in expressing the ISA model through SBI/FWFT? > >> > > > > Exposing medeleg/menvcfg is not the right approach because a > > Guest/VM does not have M-mode hence it is not appropriate to > > expose m<xyz> CSRs via ONE_REG interface. This also aligns > > with H-extension architecture which does not virtualize M-mode. > > We already have mvendorid, marchid, and mipid in kvm_riscv_config. The mvendorid, marchid, and mipid are accessible via SBI BASE extension but not any other M-mode CSRs hence these are special. > > The virtualized M-mode is userspace+KVM. (KVM doesn't allow userspace > to configure most things now, but I think we'll have to change that when > getting ready for production.) The RISC-V architecture is not designed to virtualize M-mode and there is no practical use-case for virtualized M-mode hence WE WON'T BE SUPPORTING IT IN KVM RISC-V. FYI, the KVM ARM64 does not virtualize EL3 either and it is already in production so please stop making random arguments for requiring virtualized M-mode for production. > > > We already had discussions about this in the past. > > > > As such, we have two options. One option is to expose > > hedeleg/henvcfg via kvm_riscv_config and another option > > is to have a separate ONE_REG for each FWFT feature. > > > > Separate ONE_REG registers for each FWFT feature is better > > than directly exposing hedeleg/henvcfg via ONE_REG because: > > > > 1) Once nested virtualization lands, we will be having separate > > hedeleg/henvcfg as part of nested virtualization state of Guest > > which is trap-n-emulated by KVM. The existence of hedeleg/henvcfg > > in kvm_riscv_config and nested virtualization state will only create > > more confusion. > > Right, the userspace registers for this can't be called h*. > > > 2) Not all bits in hedeleg/henvcfg are used for FWFT since quite > > a few bits are programmed with fixed value based on KVM > > implementation choices (which may change in future). > > Yes, we'll want to expose some to userspace. > > > Also, > > things like set_debug_ioctl() change hedeleg at runtime > > which allow KVM user space to decide who takes breakpoint > > traps from Guest/VM. > > This is still doable. The clear hedeleg bit does not have to change the > virtualized behavior -- if the guest is expecting to see breakpoint > traps, then even if userspace+KVM configure the architecture to direct > the traps to the hypervisor, they must then forward the breakpoints that > were supposed to be delivered to the guest. > > > This means value saved/restored > > through hedeleg/henvcfg in kvm_riscv_config becomes > > specific to the kernel version and specific to host ISA features. > > Hedeleg/henvcfg bits do not have to be the same as userspace interface > bits -- KVM always has to distinguish what the userspace wants to > virtualize, and what the KVM changed for its own reasons. > > > 3) We anyway need to provide ONE_REG interface to > > save/restore FWFT feature flags so it's better to keep the > > FWFT feature value as part of the same ONE_REG interface. > > I think we want to have SBI in userspace (especially for single-shot > ecalls like FWFT). The userspace implementation will want an interface > to set the ISA bits, and it's very awkward with the proposed design. > > Flags can to stay, in case the userpace wants to accelerate FWFT. > > > 4) The availability of quite a few FWFT features is dependent > > on corresponding ISA extensions so having separate ONE_REG > > registers of each FWFT feature allows get_reg_list_ioctl() to > > provide KVM user-space only available FWFT feature registers. > > Yes, but similarly the userspace would be forbidden from setting bits > that cannot be expressed in henvcfg/hededeg. Instead of having henvcfg/hededeg via ONE_REG with restrictions, it's much cleaner and maintainable to have a separate ONE_REG register for each FWFT feature. > > There are also behaviors we want to configure that do not have a FWFT > toggle. e.g. the recent patches for delegation of illegal-instruction > exceptions that changed the guest behavior -- someone might want to > keep incrementing the SBI PMU counter, and someone will want to forward > them to be implemented in userspace (when developing a new extension, > because most of the existing ISA can still be accelerated by KVM). There are alternate approaches for supporting SBI PMU counters in user-mode where we don't need virtualized M-mode. In any case, we will support user-mode emulated perf counters only when absolutely needed. > > For general virtualization, we want to be able to configure the > following behavior for each exception that would go to the virtualized > M-mode: > 0) delegated to the guest > 1) implemented by userspace > 2-N) implementations by KVM (ideally zero or one) > > We can have medeleg, and another method to decide how to handle trapped > exceptions, but it probably makes more sense to have a per-exception > ONE_REG that sets how each exception behaves. > No pointing in discussing this further since we won't be supporting virtualized M-mode. -- Anup
2025-08-19T21:22:27+05:30, Anup Patel <apatel@ventanamicro.com>: > On Tue, Aug 19, 2025 at 5:13 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: >> >> 2025-08-19T12:00:43+05:30, Anup Patel <apatel@ventanamicro.com>: >> > On Mon, Aug 18, 2025 at 3:59 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: >> >> >> >> 2025-08-14T21:25:42+05:30, Anup Patel <apatel@ventanamicro.com>: >> >> > This series adds ONE_REG interface for SBI FWFT extension implemented >> >> > by KVM RISC-V. >> >> >> >> I think it would be better to ONE_REG the CSRs (medeleg/menvcfg), or at >> >> least expose their CSR fields (each sensible medeleg bit, PMM, ...) >> >> through kvm_riscv_config, than to couple this with SBI/FWFT. >> >> >> >> The controlled behavior is defined by the ISA, and userspace might want >> >> to configure the S-mode execution environment even when SBI/FWFT is not >> >> present, which is not possible with the current design. >> >> >> >> Is there a benefit in expressing the ISA model through SBI/FWFT? >> >> >> > >> > Exposing medeleg/menvcfg is not the right approach because a >> > Guest/VM does not have M-mode hence it is not appropriate to >> > expose m<xyz> CSRs via ONE_REG interface. This also aligns >> > with H-extension architecture which does not virtualize M-mode. >> >> We already have mvendorid, marchid, and mipid in kvm_riscv_config. > > The mvendorid, marchid, and mipid are accessible via SBI BASE > extension but not any other M-mode CSRs hence these are special. > >> >> The virtualized M-mode is userspace+KVM. (KVM doesn't allow userspace >> to configure most things now, but I think we'll have to change that when >> getting ready for production.) > > The RISC-V architecture is not designed to virtualize M-mode > and there is no practical use-case for virtualized M-mode hence > WE WON'T BE SUPPORTING IT IN KVM RISC-V. Oh, sorry for the misunderstanding, I'll be clearer next time and talk about implementation of the supervisor execution environment. KVM+userspace provides SEE to the VS-mode, which is to VS-mode as what M-mode is to S-mode, hence I called KVM+userspace a virtualized M-mode. > FYI, the KVM ARM64 does not virtualize EL3 either and it is > already in production so please stop making random arguments > for requiring virtualized M-mode for production. Yeah, I agree that we don't need it, I just had to provide so many examples in the previous discussion that I went into quite niche cases. The increased flexibility is similarly useful for more important cases: we can't avoid "virtualized M-mode"/SEE, but we don't have to completely implement it in HS-mode. >> For general virtualization, we want to be able to configure the >> following behavior for each exception that would go to the virtualized >> M-mode: >> 0) delegated to the guest >> 1) implemented by userspace >> 2-N) implementations by KVM (ideally zero or one) >> >> We can have medeleg, and another method to decide how to handle trapped >> exceptions, but it probably makes more sense to have a per-exception >> ONE_REG that sets how each exception behaves. >> > > No pointing in discussing this further since we won't be supporting > virtualized M-mode. I understand, back to the current series: I think we need to provide means with which userspace can control which FWFT features are enabled, because KVM just exposes everything it know and hardware supports right now: 1) Migration between different systems would be hindered 2) We couldn't add more FWFT features without breaking the SEE The (2) is similar to how we must set ".default_disabled = true" to current FWFT, because KVM can't be changing the SEE for userspace. Do you want me to send a patch that inverts the default, to make all future SBI extension start as disabled, so we can't easily repeat the mistake in the future? Thanks.
On Tue, Aug 19, 2025 at 11:05 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: > > 2025-08-19T21:22:27+05:30, Anup Patel <apatel@ventanamicro.com>: > > On Tue, Aug 19, 2025 at 5:13 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: > >> > >> 2025-08-19T12:00:43+05:30, Anup Patel <apatel@ventanamicro.com>: > >> > On Mon, Aug 18, 2025 at 3:59 PM Radim Krčmář <rkrcmar@ventanamicro.com> wrote: > >> >> > >> >> 2025-08-14T21:25:42+05:30, Anup Patel <apatel@ventanamicro.com>: > >> >> > This series adds ONE_REG interface for SBI FWFT extension implemented > >> >> > by KVM RISC-V. > >> >> > >> >> I think it would be better to ONE_REG the CSRs (medeleg/menvcfg), or at > >> >> least expose their CSR fields (each sensible medeleg bit, PMM, ...) > >> >> through kvm_riscv_config, than to couple this with SBI/FWFT. > >> >> > >> >> The controlled behavior is defined by the ISA, and userspace might want > >> >> to configure the S-mode execution environment even when SBI/FWFT is not > >> >> present, which is not possible with the current design. > >> >> > >> >> Is there a benefit in expressing the ISA model through SBI/FWFT? > >> >> > >> > > >> > Exposing medeleg/menvcfg is not the right approach because a > >> > Guest/VM does not have M-mode hence it is not appropriate to > >> > expose m<xyz> CSRs via ONE_REG interface. This also aligns > >> > with H-extension architecture which does not virtualize M-mode. > >> > >> We already have mvendorid, marchid, and mipid in kvm_riscv_config. > > > > The mvendorid, marchid, and mipid are accessible via SBI BASE > > extension but not any other M-mode CSRs hence these are special. > > > >> > >> The virtualized M-mode is userspace+KVM. (KVM doesn't allow userspace > >> to configure most things now, but I think we'll have to change that when > >> getting ready for production.) > > > > The RISC-V architecture is not designed to virtualize M-mode > > and there is no practical use-case for virtualized M-mode hence > > WE WON'T BE SUPPORTING IT IN KVM RISC-V. > > Oh, sorry for the misunderstanding, I'll be clearer next time and talk > about implementation of the supervisor execution environment. > KVM+userspace provides SEE to the VS-mode, which is to VS-mode as what > M-mode is to S-mode, hence I called KVM+userspace a virtualized M-mode. > > > FYI, the KVM ARM64 does not virtualize EL3 either and it is > > already in production so please stop making random arguments > > for requiring virtualized M-mode for production. > > Yeah, I agree that we don't need it, I just had to provide so many > examples in the previous discussion that I went into quite niche cases. > > The increased flexibility is similarly useful for more important cases: > we can't avoid "virtualized M-mode"/SEE, but we don't have to completely > implement it in HS-mode. > > >> For general virtualization, we want to be able to configure the > >> following behavior for each exception that would go to the virtualized > >> M-mode: > >> 0) delegated to the guest > >> 1) implemented by userspace > >> 2-N) implementations by KVM (ideally zero or one) > >> > >> We can have medeleg, and another method to decide how to handle trapped > >> exceptions, but it probably makes more sense to have a per-exception > >> ONE_REG that sets how each exception behaves. > >> > > > > No pointing in discussing this further since we won't be supporting > > virtualized M-mode. > > I understand, back to the current series: > > I think we need to provide means with which userspace can control which > FWFT features are enabled, because KVM just exposes everything it know > and hardware supports right now: > 1) Migration between different systems would be hindered > 2) We couldn't add more FWFT features without breaking the SEE The FWFT features are defined to be backward compatible by SBI spec and these features only affect after Guest/VM has configured them. This means if a Guest/VM is not aware of SBI FWFT then it won't configure FWFT features and Guest/VM should still work fine. > > The (2) is similar to how we must set ".default_disabled = true" to > current FWFT, because KVM can't be changing the SEE for userspace. Not needed. We only keep an SBI extension disabled by default if it is being forwarded to KVM userspace (e.g. DBCN, SUSP) because KVM userspace needs to opt-in for such SBI extension handling. In other cases, the SBI extension is enabled by default and KVM userspace can explicitly disable it for Guest/VM using ONE_REG interface. This is true for ISA extensions as well. > > Do you want me to send a patch that inverts the default, to make all > future SBI extension start as disabled, so we can't easily repeat the > mistake in the future? No need to send any patch. Please focus on your current assignments. Regards, Anup
© 2016 - 2025 Red Hat, Inc.