Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
state for a vCPU before it is first scheduled.
The function configures trap and interrupt delegation to VS-mode by
setting the appropriate bits in the hedeleg and hideleg registers,
initializes hstatus so that execution enters VS-mode when control is
passed to the guest, and restricts guest access to hardware performance
counters by initializing hcounteren, as unrestricted access would
require additional handling in Xen.
When the Smstateen and SSAIA extensions are available, access to AIA
CSRs and IMSIC guest interrupt files is enabled by setting the
corresponding bits in hstateen0, avoiding unnecessary traps into Xen
(note that SVSLCT(Supervisor Virtual Select) name is used intead of
CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
on it).
If the Svpbmt extension is supported, the PBMTE bit is set in
henvcfg to allow its use for VS-stage address translation. Guest
access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
hstateen0, as a guest may need to control certain characteristics of
the U-mode (VU-mode when V=1) execution environment.
For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
hstateen0), the written value is re-read from hardware and cached in
vcpu->arch to avoid divergence between the software state and the actual
CSR contents.
As hstatus is not part of struct arch_vcpu (it already resides in
struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
a uniform way to access hstatus and other guest CPU user registers.
This establishes a consistent and well-defined initial CSR state for
vCPUs prior to their first context switch.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- As hstatus isn't a part of arch_vcpu structure (as it is already a part of
cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
hstatus and other CPU user regs.
- Sort hideleg bit setting by value. Drop a stray blank.
- Drop | when the first initialization of hcounteren and hennvcfg happen.
- Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
instead of open-coding it.
- Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
reason some bits of hedeleg and hideleg are r/o.
The similar patter is used for hstateen0 as some of the bits could be r/o.
- Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
- Drop local variables hstatus, hideleg and hedeleg as they aren't used
anymore.
---
xen/arch/riscv/cpufeature.c | 1 +
xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
xen/arch/riscv/include/asm/cpufeature.h | 1 +
xen/arch/riscv/include/asm/current.h | 2 +
xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
5 files changed, 79 insertions(+)
diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
index 02b68aeaa49f..03e27b037be0 100644
--- a/xen/arch/riscv/cpufeature.c
+++ b/xen/arch/riscv/cpufeature.c
@@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
RISCV_ISA_EXT_DATA(zbb),
RISCV_ISA_EXT_DATA(zbs),
RISCV_ISA_EXT_DATA(smaia),
+ RISCV_ISA_EXT_DATA(smstateen),
RISCV_ISA_EXT_DATA(ssaia),
RISCV_ISA_EXT_DATA(svade),
RISCV_ISA_EXT_DATA(svpbmt),
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 9c546267881b..3ae5fa3a8805 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,77 @@
#include <xen/sched.h>
#include <xen/vmap.h>
+#include <asm/cpufeature.h>
+#include <asm/csr.h>
+#include <asm/riscv_encoding.h>
+
+#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
+ BIT(CAUSE_FETCH_ACCESS, U) | \
+ BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
+ BIT(CAUSE_BREAKPOINT, U) | \
+ BIT(CAUSE_MISALIGNED_LOAD, U) | \
+ BIT(CAUSE_LOAD_ACCESS, U) | \
+ BIT(CAUSE_MISALIGNED_STORE, U) | \
+ BIT(CAUSE_STORE_ACCESS, U) | \
+ BIT(CAUSE_USER_ECALL, U) | \
+ BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
+ BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
+ BIT(CAUSE_STORE_PAGE_FAULT, U))
+
+#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
+
+static void vcpu_csr_init(struct vcpu *v)
+{
+ register_t hstateen0;
+
+ csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
+ v->arch.hedeleg = csr_read(CSR_HEDELEG);
+
+ vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
+
+ csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
+ v->arch.hideleg = csr_read(CSR_HIDELEG);
+
+ /*
+ * VS should access only the time counter directly.
+ * Everything else should trap.
+ */
+ v->arch.hcounteren = HCOUNTEREN_TM;
+
+ if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
+ {
+ csr_write(CSR_HENVCFG, ENVCFG_PBMTE);
+ v->arch.henvcfg = csr_read(CSR_HENVCFG);
+ }
+
+ if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
+ {
+ if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))
+ /*
+ * If the hypervisor extension is implemented, the same three
+ * bitsare defined also in hypervisor CSR hstateen0 but concern
+ * only the state potentially accessible to a virtual machine
+ * executing in privilege modes VS and VU:
+ * bit 60 CSRs siselect and sireg (really vsiselect and
+ * vsireg)
+ * bit 59 CSRs siph and sieh (RV32 only) and stopi (really
+ * vsiph, vsieh, and vstopi)
+ * bit 58 all state of IMSIC guest interrupt files, including
+ * CSR stopei (really vstopei)
+ * If one of these bits is zero in hstateen0, and the same bit is
+ * one in mstateen0, then an attempt to access the corresponding
+ * state from VS or VU-mode raises a virtual instruction exception.
+ */
+ hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
+
+ /* Allow guest to access CSR_ENVCFG */
+ hstateen0 |= SMSTATEEN0_HSENVCFG;
+
+ csr_write(CSR_HSTATEEN0, hstateen0);
+ v->arch.hstateen0 = csr_read(CSR_HSTATEEN0);
+ }
+}
+
static void continue_new_vcpu(struct vcpu *prev)
{
BUG_ON("unimplemented\n");
@@ -33,6 +104,8 @@ int arch_vcpu_create(struct vcpu *v)
v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
+ vcpu_csr_init(v);
+
/* Idle VCPUs don't need the rest of this setup */
if ( is_idle_vcpu(v) )
return rc;
diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
index b69616038888..ef02a3e26d2c 100644
--- a/xen/arch/riscv/include/asm/cpufeature.h
+++ b/xen/arch/riscv/include/asm/cpufeature.h
@@ -36,6 +36,7 @@ enum riscv_isa_ext_id {
RISCV_ISA_EXT_zbb,
RISCV_ISA_EXT_zbs,
RISCV_ISA_EXT_smaia,
+ RISCV_ISA_EXT_smstateen,
RISCV_ISA_EXT_ssaia,
RISCV_ISA_EXT_svade,
RISCV_ISA_EXT_svpbmt,
diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
index 58c9f1506b7c..5fbee8182caa 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -48,6 +48,8 @@ DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
#define get_cpu_current(cpu) per_cpu(curr_vcpu, cpu)
#define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
+#define vcpu_guest_cpu_user_regs(vcpu) \
+ (&(vcpu)->arch.cpu_info->guest_cpu_user_regs)
#define switch_stack_and_jump(stack, fn) do { \
asm volatile ( \
diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h b/xen/arch/riscv/include/asm/riscv_encoding.h
index 1f7e612366f8..dd15731a86fa 100644
--- a/xen/arch/riscv/include/asm/riscv_encoding.h
+++ b/xen/arch/riscv/include/asm/riscv_encoding.h
@@ -228,6 +228,8 @@
#define ENVCFG_CBIE_INV _UL(0x3)
#define ENVCFG_FIOM _UL(0x1)
+#define HCOUNTEREN_TM BIT(1, U)
+
/* ===== User-level CSRs ===== */
/* User Trap Setup (N-extension) */
--
2.52.0
Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
> Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
> state for a vCPU before it is first scheduled.
To me, "hypervisor-visible CSR state" sounds like some state of the
guest the hypervisor has view on. But to what I read, it's more
hypervisor state CSRs regarding what to intercept and various
virtualization behavior.
> The function configures trap and interrupt delegation to VS-mode by
> setting the appropriate bits in the hedeleg and hideleg registers,
> initializes hstatus so that execution enters VS-mode when control is
> passed to the guest, and restricts guest access to hardware performance
> counters by initializing hcounteren, as unrestricted access would
> require additional handling in Xen.
> When the Smstateen and SSAIA extensions are available, access to AIA
> CSRs and IMSIC guest interrupt files is enabled by setting the
> corresponding bits in hstateen0, avoiding unnecessary traps into Xen
> (note that SVSLCT(Supervisor Virtual Select) name is used intead of
> CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
> on it).
> If the Svpbmt extension is supported, the PBMTE bit is set in
> henvcfg to allow its use for VS-stage address translation. Guest
> access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
> hstateen0, as a guest may need to control certain characteristics of
> the U-mode (VU-mode when V=1) execution environment.
>
> For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
> hstateen0), the written value is re-read from hardware and cached in
> vcpu->arch to avoid divergence between the software state and the actual
> CSR contents.
>
AFAIU from RISC-V isa document, at least for some CSRs the read-only-0
bits are well-defined and means "can't be configured" due to not having
a meaning (e.g hedeleg defines as read-only "Environment call from
HS-mode" because guest can't be in a "actual" HS-mode).
> As hstatus is not part of struct arch_vcpu (it already resides in
> struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
> a uniform way to access hstatus and other guest CPU user registers.
>
> This establishes a consistent and well-defined initial CSR state for
> vCPUs prior to their first context switch.
>
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> ---
> Changes in v2:
> - As hstatus isn't a part of arch_vcpu structure (as it is already a part of
> cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
> hstatus and other CPU user regs.
Sounds like hstatus wants to be splitted from guest_cpu_user_regs (which
are supposed to be GPR ?).
> - Sort hideleg bit setting by value. Drop a stray blank.
> - Drop | when the first initialization of hcounteren and hennvcfg happen.
> - Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
> instead of open-coding it.
> - Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
> of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
> reason some bits of hedeleg and hideleg are r/o.
> The similar patter is used for hstateen0 as some of the bits could be r/o.
> - Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
> SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
> - Drop local variables hstatus, hideleg and hedeleg as they aren't used
> anymore.
> ---
> xen/arch/riscv/cpufeature.c | 1 +
> xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
> xen/arch/riscv/include/asm/cpufeature.h | 1 +
> xen/arch/riscv/include/asm/current.h | 2 +
> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
> 5 files changed, 79 insertions(+)
>
> diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
> index 02b68aeaa49f..03e27b037be0 100644
> --- a/xen/arch/riscv/cpufeature.c
> +++ b/xen/arch/riscv/cpufeature.c
> @@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
> RISCV_ISA_EXT_DATA(zbb),
> RISCV_ISA_EXT_DATA(zbs),
> RISCV_ISA_EXT_DATA(smaia),
> + RISCV_ISA_EXT_DATA(smstateen),
> RISCV_ISA_EXT_DATA(ssaia),
> RISCV_ISA_EXT_DATA(svade),
> RISCV_ISA_EXT_DATA(svpbmt),
> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
> index 9c546267881b..3ae5fa3a8805 100644
> --- a/xen/arch/riscv/domain.c
> +++ b/xen/arch/riscv/domain.c
> @@ -5,6 +5,77 @@
> #include <xen/sched.h>
> #include <xen/vmap.h>
>
> +#include <asm/cpufeature.h>
> +#include <asm/csr.h>
> +#include <asm/riscv_encoding.h>
> +
> +#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
> + BIT(CAUSE_FETCH_ACCESS, U) | \
> + BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
> + BIT(CAUSE_BREAKPOINT, U) | \
> + BIT(CAUSE_MISALIGNED_LOAD, U) | \
> + BIT(CAUSE_LOAD_ACCESS, U) | \
> + BIT(CAUSE_MISALIGNED_STORE, U) | \
> + BIT(CAUSE_STORE_ACCESS, U) | \
> + BIT(CAUSE_USER_ECALL, U) | \
> + BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
> + BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
> + BIT(CAUSE_STORE_PAGE_FAULT, U))
> +
> +#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
> +
> +static void vcpu_csr_init(struct vcpu *v)
> +{
> + register_t hstateen0;
> +
> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
> +
> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
> +
> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
> + v->arch.hideleg = csr_read(CSR_HIDELEG);
> +
To me, that feels odd to set machine CSR here. Especially if (I guess)
that we would update them anyway during context switches.
I think it would be better to have :
- vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
touch machine CSR
- context switching logic -> loads vcpu-specific machine CSR from vcpu
structure
We would have to make a context switch to enter the vcpu for the first
time; but that's to be expected.
With my proposal, we would also want to move the vcpu_csr_init()
invocation to the place we initialize the vcpu_arch structure rather
than the first time we schedule inside that vcpu.
That would also allow to take account of per-domain configuration if we
need to (e.g feature flags).
> + /*
> + * VS should access only the time counter directly.
> + * Everything else should trap.
> + */
> + v->arch.hcounteren = HCOUNTEREN_TM;
> +
> + if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_svpbmt) )
> + {
> + csr_write(CSR_HENVCFG, ENVCFG_PBMTE);
> + v->arch.henvcfg = csr_read(CSR_HENVCFG);
> + }
> +
> + if ( riscv_isa_extension_available(NULL, RISCV_ISA_EXT_smstateen) )
> + {
> + if (riscv_isa_extension_available(NULL, RISCV_ISA_EXT_ssaia))
> + /*
> + * If the hypervisor extension is implemented, the same three
> + * bitsare defined also in hypervisor CSR hstateen0 but concern
> + * only the state potentially accessible to a virtual machine
> + * executing in privilege modes VS and VU:
> + * bit 60 CSRs siselect and sireg (really vsiselect and
> + * vsireg)
> + * bit 59 CSRs siph and sieh (RV32 only) and stopi (really
> + * vsiph, vsieh, and vstopi)
> + * bit 58 all state of IMSIC guest interrupt files, including
> + * CSR stopei (really vstopei)
> + * If one of these bits is zero in hstateen0, and the same bit is
> + * one in mstateen0, then an attempt to access the corresponding
> + * state from VS or VU-mode raises a virtual instruction exception.
> + */
> + hstateen0 = SMSTATEEN0_AIA | SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT;
> +
> + /* Allow guest to access CSR_ENVCFG */
> + hstateen0 |= SMSTATEEN0_HSENVCFG;
> +
> + csr_write(CSR_HSTATEEN0, hstateen0);
> + v->arch.hstateen0 = csr_read(CSR_HSTATEEN0);
> + }
> +}
> +
> static void continue_new_vcpu(struct vcpu *prev)
> {
> BUG_ON("unimplemented\n");
> @@ -33,6 +104,8 @@ int arch_vcpu_create(struct vcpu *v)
> v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
> v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
>
> + vcpu_csr_init(v);
> +
> /* Idle VCPUs don't need the rest of this setup */
> if ( is_idle_vcpu(v) )
> return rc;
> diff --git a/xen/arch/riscv/include/asm/cpufeature.h b/xen/arch/riscv/include/asm/cpufeature.h
> index b69616038888..ef02a3e26d2c 100644
> --- a/xen/arch/riscv/include/asm/cpufeature.h
> +++ b/xen/arch/riscv/include/asm/cpufeature.h
> @@ -36,6 +36,7 @@ enum riscv_isa_ext_id {
> RISCV_ISA_EXT_zbb,
> RISCV_ISA_EXT_zbs,
> RISCV_ISA_EXT_smaia,
> + RISCV_ISA_EXT_smstateen,
> RISCV_ISA_EXT_ssaia,
> RISCV_ISA_EXT_svade,
> RISCV_ISA_EXT_svpbmt,
> diff --git a/xen/arch/riscv/include/asm/current.h b/xen/arch/riscv/include/asm/current.h
> index 58c9f1506b7c..5fbee8182caa 100644
> --- a/xen/arch/riscv/include/asm/current.h
> +++ b/xen/arch/riscv/include/asm/current.h
> @@ -48,6 +48,8 @@ DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
> #define get_cpu_current(cpu) per_cpu(curr_vcpu, cpu)
>
> #define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
> +#define vcpu_guest_cpu_user_regs(vcpu) \
> + (&(vcpu)->arch.cpu_info->guest_cpu_user_regs)
> > #define switch_stack_and_jump(stack, fn) do { \
> asm volatile ( \
> diff --git a/xen/arch/riscv/include/asm/riscv_encoding.h b/xen/arch/riscv/include/asm/riscv_encoding.h
> index 1f7e612366f8..dd15731a86fa 100644
> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
> @@ -228,6 +228,8 @@
> #define ENVCFG_CBIE_INV _UL(0x3)
> #define ENVCFG_FIOM _UL(0x1)
>
> +#define HCOUNTEREN_TM BIT(1, U)
> +
> /* ===== User-level CSRs ===== */
>
> /* User Trap Setup (N-extension) */
--
Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
On 1/24/26 11:44 PM, Teddy Astie wrote:
> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>> Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
>> state for a vCPU before it is first scheduled.
> To me, "hypervisor-visible CSR state" sounds like some state of the
> guest the hypervisor has view on. But to what I read, it's more
> hypervisor state CSRs regarding what to intercept and various
> virtualization behavior.
I'll rephrase then to:
Introduce vcpu_csr_init() to initialise hypervisor CSRs that control
vCPU execution and virtualization behaviour before the vCPU is first
scheduled.
>
>> The function configures trap and interrupt delegation to VS-mode by
>> setting the appropriate bits in the hedeleg and hideleg registers,
>> initializes hstatus so that execution enters VS-mode when control is
>> passed to the guest, and restricts guest access to hardware performance
>> counters by initializing hcounteren, as unrestricted access would
>> require additional handling in Xen.
>> When the Smstateen and SSAIA extensions are available, access to AIA
>> CSRs and IMSIC guest interrupt files is enabled by setting the
>> corresponding bits in hstateen0, avoiding unnecessary traps into Xen
>> (note that SVSLCT(Supervisor Virtual Select) name is used intead of
>> CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
>> on it).
>> If the Svpbmt extension is supported, the PBMTE bit is set in
>> henvcfg to allow its use for VS-stage address translation. Guest
>> access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
>> hstateen0, as a guest may need to control certain characteristics of
>> the U-mode (VU-mode when V=1) execution environment.
>>
>> For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
>> hstateen0), the written value is re-read from hardware and cached in
>> vcpu->arch to avoid divergence between the software state and the actual
>> CSR contents.
>>
> AFAIU from RISC-V isa document, at least for some CSRs the read-only-0
> bits are well-defined and means "can't be configured" due to not having
> a meaning (e.g hedeleg defines as read-only "Environment call from
> HS-mode" because guest can't be in a "actual" HS-mode).
It was said about bits which hypervisor tries to set in the mentioned
registers and IIRC all of them could be r/o-0 as, for example, M-mode
can decide not to delegate them to HS-mode.
>
>> As hstatus is not part of struct arch_vcpu (it already resides in
>> struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
>> a uniform way to access hstatus and other guest CPU user registers.
>>
>> This establishes a consistent and well-defined initial CSR state for
>> vCPUs prior to their first context switch.
>>
>> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> ---
>> Changes in v2:
>> - As hstatus isn't a part of arch_vcpu structure (as it is already a part of
>> cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
>> hstatus and other CPU user regs.
> Sounds like hstatus wants to be splitted from guest_cpu_user_regs (which
> are supposed to be GPR ?).
Generally, agree. But hstatus want to be saved during a trap even before guest
is started and considering that we already have the code which stores it in
guest_cpu_user_regs structure and handle it during the trap, I've decided just
to drop hstatus from arch_vcpu.
>
>> - Sort hideleg bit setting by value. Drop a stray blank.
>> - Drop | when the first initialization of hcounteren and hennvcfg happen.
>> - Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
>> instead of open-coding it.
>> - Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
>> of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
>> reason some bits of hedeleg and hideleg are r/o.
>> The similar patter is used for hstateen0 as some of the bits could be r/o.
>> - Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
>> SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
>> - Drop local variables hstatus, hideleg and hedeleg as they aren't used
>> anymore.
>> ---
>> xen/arch/riscv/cpufeature.c | 1 +
>> xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
>> xen/arch/riscv/include/asm/cpufeature.h | 1 +
>> xen/arch/riscv/include/asm/current.h | 2 +
>> xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
>> 5 files changed, 79 insertions(+)
>>
>> diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
>> index 02b68aeaa49f..03e27b037be0 100644
>> --- a/xen/arch/riscv/cpufeature.c
>> +++ b/xen/arch/riscv/cpufeature.c
>> @@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
>> RISCV_ISA_EXT_DATA(zbb),
>> RISCV_ISA_EXT_DATA(zbs),
>> RISCV_ISA_EXT_DATA(smaia),
>> + RISCV_ISA_EXT_DATA(smstateen),
>> RISCV_ISA_EXT_DATA(ssaia),
>> RISCV_ISA_EXT_DATA(svade),
>> RISCV_ISA_EXT_DATA(svpbmt),
>> diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
>> index 9c546267881b..3ae5fa3a8805 100644
>> --- a/xen/arch/riscv/domain.c
>> +++ b/xen/arch/riscv/domain.c
>> @@ -5,6 +5,77 @@
>> #include <xen/sched.h>
>> #include <xen/vmap.h>
>>
>> +#include <asm/cpufeature.h>
>> +#include <asm/csr.h>
>> +#include <asm/riscv_encoding.h>
>> +
>> +#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
>> + BIT(CAUSE_FETCH_ACCESS, U) | \
>> + BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
>> + BIT(CAUSE_BREAKPOINT, U) | \
>> + BIT(CAUSE_MISALIGNED_LOAD, U) | \
>> + BIT(CAUSE_LOAD_ACCESS, U) | \
>> + BIT(CAUSE_MISALIGNED_STORE, U) | \
>> + BIT(CAUSE_STORE_ACCESS, U) | \
>> + BIT(CAUSE_USER_ECALL, U) | \
>> + BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
>> + BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
>> + BIT(CAUSE_STORE_PAGE_FAULT, U))
>> +
>> +#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
>> +
>> +static void vcpu_csr_init(struct vcpu *v)
>> +{
>> + register_t hstateen0;
>> +
>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>> +
>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>> +
>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>> +
> To me, that feels odd to set machine CSR here. Especially if (I guess)
> that we would update them anyway during context switches.
Yes, they will be updated anyway.
When this approach was initially suggested, I considered the case where
some code might attempt to use these bits before the context-switch logic
runs. In that situation, we could end up executing code that assumes the
feature is available (because the bit is set in|v->arch.some_reg|), only
to later discover that the corresponding CSR bit is read-only zero.
>
> I think it would be better to have :
> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
> touch machine CSR
> - context switching logic -> loads vcpu-specific machine CSR from vcpu
> structure
>
> We would have to make a context switch to enter the vcpu for the first
> time; but that's to be expected.
>
> With my proposal, we would also want to move the vcpu_csr_init()
> invocation to the place we initialize the vcpu_arch structure rather
> than the first time we schedule inside that vcpu.
I think I may be misunderstanding something here. vcpu_csr_init() is now
called from arch_vcpu_create(), which initialises architecture-specific
arch_vcpu fields. Am I missing something?
>
> That would also allow to take account of per-domain configuration if we
> need to (e.g feature flags).
Thanks.
~ Oleksii
On 24.01.2026 23:44, Teddy Astie wrote:
> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>> +static void vcpu_csr_init(struct vcpu *v)
>> +{
>> + register_t hstateen0;
>> +
>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>> +
>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>> +
>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>> +
>
> To me, that feels odd to set machine CSR here. Especially if (I guess)
> that we would update them anyway during context switches.
>
> I think it would be better to have :
> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
> touch machine CSR
> - context switching logic -> loads vcpu-specific machine CSR from vcpu
> structure
>
> We would have to make a context switch to enter the vcpu for the first
> time; but that's to be expected.
>
> With my proposal, we would also want to move the vcpu_csr_init()
> invocation to the place we initialize the vcpu_arch structure rather
> than the first time we schedule inside that vcpu.
Aiui the writes were put here to be able to cope with r/o-zero bits. Yet
indeed it can't be cone like this. While it may work for domains created
during boot, these CSRs would be changed under the feet of a running vCPU
when done this way for domain creation later at runtime.
Instead, as I think I had also suggested earlier on, the r/o-zero-ness of
bits in particular CSRs wants determining once during boot, and then that
mask wants using when setting up vCPU-s.
Jan
On 1/26/26 9:36 AM, Jan Beulich wrote:
> On 24.01.2026 23:44, Teddy Astie wrote:
>> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>>> +static void vcpu_csr_init(struct vcpu *v)
>>> +{
>>> + register_t hstateen0;
>>> +
>>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>>> +
>>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>>> +
>>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>>> +
>> To me, that feels odd to set machine CSR here. Especially if (I guess)
>> that we would update them anyway during context switches.
>>
>> I think it would be better to have :
>> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
>> touch machine CSR
>> - context switching logic -> loads vcpu-specific machine CSR from vcpu
>> structure
>>
>> We would have to make a context switch to enter the vcpu for the first
>> time; but that's to be expected.
>>
>> With my proposal, we would also want to move the vcpu_csr_init()
>> invocation to the place we initialize the vcpu_arch structure rather
>> than the first time we schedule inside that vcpu.
> Aiui the writes were put here to be able to cope with r/o-zero bits. Yet
> indeed it can't be cone like this. While it may work for domains created
> during boot, these CSRs would be changed under the feet of a running vCPU
> when done this way for domain creation later at runtime.
Why these CSRs would want to be changed when we have already running vCPU?
Even they will be changed what is an issue, they will be sync-ed during
context switch.
>
> Instead, as I think I had also suggested earlier on, the r/o-zero-ness of
> bits in particular CSRs wants determining once during boot, and then that
> mask wants using when setting up vCPU-s.
Somewhere even before create_domUs() will be called?
~ Oleksii
On 26.01.2026 10:47, Oleksii Kurochko wrote:
>
> On 1/26/26 9:36 AM, Jan Beulich wrote:
>> On 24.01.2026 23:44, Teddy Astie wrote:
>>> Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
>>>> +static void vcpu_csr_init(struct vcpu *v)
>>>> +{
>>>> + register_t hstateen0;
>>>> +
>>>> + csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
>>>> + v->arch.hedeleg = csr_read(CSR_HEDELEG);
>>>> +
>>>> + vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
>>>> +
>>>> + csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
>>>> + v->arch.hideleg = csr_read(CSR_HIDELEG);
>>>> +
>>> To me, that feels odd to set machine CSR here. Especially if (I guess)
>>> that we would update them anyway during context switches.
>>>
>>> I think it would be better to have :
>>> - vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
>>> touch machine CSR
>>> - context switching logic -> loads vcpu-specific machine CSR from vcpu
>>> structure
>>>
>>> We would have to make a context switch to enter the vcpu for the first
>>> time; but that's to be expected.
>>>
>>> With my proposal, we would also want to move the vcpu_csr_init()
>>> invocation to the place we initialize the vcpu_arch structure rather
>>> than the first time we schedule inside that vcpu.
>> Aiui the writes were put here to be able to cope with r/o-zero bits. Yet
>> indeed it can't be cone like this. While it may work for domains created
>> during boot, these CSRs would be changed under the feet of a running vCPU
>> when done this way for domain creation later at runtime.
>
> Why these CSRs would want to be changed when we have already running vCPU?
>
> Even they will be changed what is an issue, they will be sync-ed during
> context switch.
Which is too late (plus on ctxt-switch-out I assume they'll be synced from
HW to internal structures). The problem here is that a vCPU, having invoked
a hypercall resulting in a new vCPU (for another domain) to be created, will
observe a change of some of the CSRs controlling its own behavior.
>> Instead, as I think I had also suggested earlier on, the r/o-zero-ness of
>> bits in particular CSRs wants determining once during boot, and then that
>> mask wants using when setting up vCPU-s.
>
> Somewhere even before create_domUs() will be called?
Yes, before any domains (and in particular vCPU-s) are created. The idle
domain may be an exception here, and system domains (not having vCPU-s) may
also be unaffected and hence not depend on particular ordering.
Jan
© 2016 - 2026 Red Hat, Inc.