[for 4.22 v5 02/18] xen/riscv: introduce VMID allocation and management

Oleksii Kurochko posted 18 patches 3 months, 3 weeks ago
There is a newer version of this series
[for 4.22 v5 02/18] xen/riscv: introduce VMID allocation and management
Posted by Oleksii Kurochko 3 months, 3 weeks ago
The current implementation is based on x86's way of allocating VMIDs:
  VMIDs partition the physical TLB. In the current implementation VMIDs are
  introduced to reduce the number of TLB flushes. Each time a guest-physical
  address space changes, instead of flushing the TLB, a new VMID is
  assigned. This reduces the number of TLB flushes to at most 1/#VMIDs.
  The biggest advantage is that hot parts of the hypervisor's code and data
  remain in the TLB.

  VMIDs are a hart-local resource.  As preemption of VMIDs is not possible,
  VMIDs are assigned in a round-robin scheme. To minimize the overhead of
  VMID invalidation, at the time of a TLB flush, VMIDs are tagged with a
  64-bit generation. Only on a generation overflow does the code need to
  invalidate all VMID information stored in the vCPUs which run on the
  specific physical processor. When such an overflow occurs, VMID usage is
  disabled to retain correctness.

Only minor changes are made compared to the x86 implementation.
These include using RISC-V-specific terminology, adding a check to ensure
the type used for storing the VMID has enough bits to hold VMIDLEN,
introducing a new function vmidlen_detect() to clarify the VMIDLEN
value, and renaming everything connected to VMID enable/disable to
"VMID use enable/disable".

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - Rename opt_vmid_use_enabled to opt_vmid to be in sync with the command
   line option.
 - Invert the expression for data->used = ... and swap "dis" and "en". Also,
   invert usage of data->used elsewhere.
 - s/vcpu_vmid_flush_vcpu/vmid_flush_vcpu.
 - Add prototypes to asm/vmid.h which could be used outside vmid.c.
 - Update the comment in vmidlen_detect(): instead of Section 3.7 ->
   Section "Physical Memory Protection".
 - Move vmid_init() call to pre_gstage_init().
---
Changes in V4:
 - s/guest's virtual/guest-physical/ in the comment inside vmid.c
   and in the commit message.
 - Drop x86-related numbers in the comment about "Sketch of the Implementation".
 - s/__read_only/__ro_after_init in declaration of opt_vmid_enabled.
 - s/hart_vmid_generation/generation.
 - Update vmidlen_detect() to work with unsigned int type for vmid_bits
   variable.
 - Drop the old variable in vmidlen_detect(); it seems there is no reason
   to restore the old value of hgatp when no guest has run on the hart yet.
 - Update the comment above local_hfence_gvma_all() in vmidlen_detect().
 - s/max_availalbe_bits/max_available_bits.
 - use BITS_PER_BYTE, instead of "<< 3".
 - Add BUILD_BUG_ON() instead of a run-time check that the number of set bits
   can be held in vmid_data->max_vmid.
 - Apply changes from the patch "x86/HVM: polish hvm_asid_init() a little" here
   (changes connected to g_disabled) with the following minor changes:
   Update the printk() message to "VMIDs use is...".
   Rename g_disabled to g_vmid_used.
 - Rename member 'disabled' of vmid_data structure to used.
 - Use gstage_mode to properly detect VMIDLEN.
---
Changes in V3:
 - Reimplement VMID allocation similarly to what x86 has implemented.
---
Changes in V2:
 - New patch.
---
 xen/arch/riscv/Makefile             |   1 +
 xen/arch/riscv/include/asm/domain.h |   6 +
 xen/arch/riscv/include/asm/vmid.h   |  14 ++
 xen/arch/riscv/p2m.c                |   3 +
 xen/arch/riscv/vmid.c               | 193 ++++++++++++++++++++++++++++
 5 files changed, 217 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/vmid.h
 create mode 100644 xen/arch/riscv/vmid.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 264e265699..e2499210c8 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -17,6 +17,7 @@ obj-y += smpboot.o
 obj-y += stubs.o
 obj-y += time.o
 obj-y += traps.o
+obj-y += vmid.o
 obj-y += vm_event.o
 
 $(TARGET): $(TARGET)-syms
diff --git a/xen/arch/riscv/include/asm/domain.h b/xen/arch/riscv/include/asm/domain.h
index c3d965a559..aac1040658 100644
--- a/xen/arch/riscv/include/asm/domain.h
+++ b/xen/arch/riscv/include/asm/domain.h
@@ -5,6 +5,11 @@
 #include <xen/xmalloc.h>
 #include <public/hvm/params.h>
 
+struct vcpu_vmid {
+    uint64_t generation;
+    uint16_t vmid;
+};
+
 struct hvm_domain
 {
     uint64_t              params[HVM_NR_PARAMS];
@@ -14,6 +19,7 @@ struct arch_vcpu_io {
 };
 
 struct arch_vcpu {
+    struct vcpu_vmid vmid;
 };
 
 struct arch_domain {
diff --git a/xen/arch/riscv/include/asm/vmid.h b/xen/arch/riscv/include/asm/vmid.h
new file mode 100644
index 0000000000..1c500c4aff
--- /dev/null
+++ b/xen/arch/riscv/include/asm/vmid.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef ASM_RISCV_VMID_H
+#define ASM_RISCV_VMID_H
+
+struct vcpu;
+struct vcpu_vmid;
+
+void vmid_init(void);
+bool vmid_handle_vmenter(struct vcpu_vmid *vmid);
+void vmid_flush_vcpu(struct vcpu *v);
+void vmid_flush_hart(void);
+
+#endif /* ASM_RISCV_VMID_H */
diff --git a/xen/arch/riscv/p2m.c b/xen/arch/riscv/p2m.c
index 00fe676089..d8027a270f 100644
--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -8,6 +8,7 @@
 #include <asm/csr.h>
 #include <asm/flushtlb.h>
 #include <asm/riscv_encoding.h>
+#include <asm/vmid.h>
 
 unsigned char __ro_after_init gstage_mode;
 
@@ -93,4 +94,6 @@ static void __init gstage_mode_detect(void)
 void __init pre_gstage_init(void)
 {
     gstage_mode_detect();
+
+    vmid_init();
 }
diff --git a/xen/arch/riscv/vmid.c b/xen/arch/riscv/vmid.c
new file mode 100644
index 0000000000..885d177e9f
--- /dev/null
+++ b/xen/arch/riscv/vmid.c
@@ -0,0 +1,193 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/domain.h>
+#include <xen/init.h>
+#include <xen/sections.h>
+#include <xen/lib.h>
+#include <xen/param.h>
+#include <xen/percpu.h>
+
+#include <asm/atomic.h>
+#include <asm/csr.h>
+#include <asm/flushtlb.h>
+#include <asm/p2m.h>
+
+/* Xen command-line option to enable VMIDs */
+static bool __ro_after_init opt_vmid = true;
+boolean_param("vmid", opt_vmid);
+
+/*
+ * VMIDs partition the physical TLB. In the current implementation VMIDs are
+ * introduced to reduce the number of TLB flushes. Each time a guest-physical
+ * address space changes, instead of flushing the TLB, a new VMID is
+ * assigned. This reduces the number of TLB flushes to at most 1/#VMIDs.
+ * The biggest advantage is that hot parts of the hypervisor's code and data
+ * remain in the TLB.
+ *
+ * Sketch of the Implementation:
+ *
+ * VMIDs are a hart-local resource.  As preemption of VMIDs is not possible,
+ * VMIDs are assigned in a round-robin scheme. To minimize the overhead of
+ * VMID invalidation, at the time of a TLB flush, VMIDs are tagged with a
+ * 64-bit generation. Only on a generation overflow does the code need to
+ * invalidate all VMID information stored in the vCPUs which run on the
+ * specific physical processor. When such an overflow occurs, VMID usage is
+ * disabled to retain correctness.
+ */
+
+/* Per-Hart VMID management. */
+struct vmid_data {
+    uint64_t generation;
+    uint16_t next_vmid;
+    uint16_t max_vmid;
+    bool used;
+};
+
+static DEFINE_PER_CPU(struct vmid_data, vmid_data);
+
+static unsigned int vmidlen_detect(void)
+{
+    unsigned int vmid_bits;
+
+    /*
+     * According to the RISC-V Privileged Architecture Spec:
+     *   When MODE=Bare, guest physical addresses are equal to supervisor
+     *   physical addresses, and there is no further memory protection
+     *   for a guest virtual machine beyond the physical memory protection
+     *   scheme described in Section "Physical Memory Protection".
+     *   In this case, the remaining fields in hgatp must be set to zeros.
+     * Therefore gstage_mode must be set to a mode other than Bare.
+     */
+    ASSERT(gstage_mode != HGATP_MODE_OFF);
+    csr_write(CSR_HGATP,
+              MASK_INSR(gstage_mode, HGATP_MODE_MASK) | HGATP_VMID_MASK);
+    vmid_bits = MASK_EXTR(csr_read(CSR_HGATP), HGATP_VMID_MASK);
+    vmid_bits = flsl(vmid_bits);
+    csr_write(CSR_HGATP, _AC(0, UL));
+
+    /*
+     * From RISC-V spec:
+     *   Speculative executions of the address-translation algorithm behave as
+     *   non-speculative executions of the algorithm do, except that they must
+     *   not set the dirty bit for a PTE, they must not trigger an exception,
+     *   and they must not create address-translation cache entries if those
+     *   entries would have been invalidated by any SFENCE.VMA instruction
+     *   executed by the hart since the speculative execution of the algorithm
+     *   began.
+     *
+     * Also, despite the fact that it is stated here that when V=0 two-stage
+     * address translation is inactive:
+     *   The current virtualization mode, denoted V, indicates whether the hart
+     *   is currently executing in a guest. When V=1, the hart is either in
+     *   virtual S-mode (VS-mode), or in virtual U-mode (VU-mode) atop a guest
+     *   OS running in VS-mode. When V=0, the hart is either in M-mode, in
+     *   HS-mode, or in U-mode atop an OS running in HS-mode. The
+     *   virtualization mode also indicates whether two-stage address
+     *   translation is active (V=1) or inactive (V=0).
+     * But at the same time, writing to the hgatp register activates it:
+     *   The hgatp register is considered active for the purposes of
+     *   the address-translation algorithm unless the effective privilege mode
+     *   is U and hstatus.HU=0.
+     *
+     * Thereby some room is left for speculation even at this stage of boot,
+     * so the local TLB may have been polluted; flush all guest TLB entries.
+     */
+    local_hfence_gvma_all();
+
+    return vmid_bits;
+}
+
+void vmid_init(void)
+{
+    static int8_t g_vmid_used = -1;
+
+    unsigned int vmid_len = vmidlen_detect();
+    struct vmid_data *data = &this_cpu(vmid_data);
+
+    BUILD_BUG_ON((HGATP_VMID_MASK >> HGATP_VMID_SHIFT) >
+                 (BIT((sizeof(data->max_vmid) * BITS_PER_BYTE), UL) - 1));
+
+    data->max_vmid = BIT(vmid_len, U) - 1;
+    data->used = opt_vmid && (vmid_len > 1);
+
+    if ( g_vmid_used < 0 )
+    {
+        g_vmid_used = data->used;
+        printk("VMIDs use is %sabled\n", data->used ? "en" : "dis");
+    }
+    else if ( g_vmid_used != data->used )
+        printk("CPU%u: VMIDs use is %sabled\n", smp_processor_id(),
+               data->used ? "en" : "dis");
+
+    /* Zero indicates 'invalid generation', so we start the count at one. */
+    data->generation = 1;
+
+    /* Zero indicates 'VMIDs use disabled', so we start the count at one. */
+    data->next_vmid = 1;
+}
+
+void vmid_flush_vcpu(struct vcpu *v)
+{
+    write_atomic(&v->arch.vmid.generation, 0);
+}
+
+void vmid_flush_hart(void)
+{
+    struct vmid_data *data = &this_cpu(vmid_data);
+
+    if ( !data->used )
+        return;
+
+    if ( likely(++data->generation != 0) )
+        return;
+
+    /*
+     * VMID generations are 64 bit.  Overflow of generations should never
+     * happen in practice.  For safety, we simply disable VMIDs, so
+     * correctness is established; it only runs a bit slower.
+     */
+    printk("%s: VMID generation overrun. Disabling VMIDs.\n", __func__);
+    data->used = false;
+}
+
+bool vmid_handle_vmenter(struct vcpu_vmid *vmid)
+{
+    struct vmid_data *data = &this_cpu(vmid_data);
+
+    /* Test if VCPU has valid VMID. */
+    if ( read_atomic(&vmid->generation) == data->generation )
+        return false;
+
+    /* If there are no free VMIDs, need to go to a new generation. */
+    if ( unlikely(data->next_vmid > data->max_vmid) )
+    {
+        vmid_flush_hart();
+        data->next_vmid = 1;
+        if ( !data->used )
+            goto disabled;
+    }
+
+    /* Now guaranteed to be a free VMID. */
+    vmid->vmid = data->next_vmid++;
+    write_atomic(&vmid->generation, data->generation);
+
+    /*
+     * When we assign VMID 1, flush all TLB entries as we are starting a new
+     * generation, and all old VMID allocations are now stale.
+     */
+    return (vmid->vmid == 1);
+
+ disabled:
+    vmid->vmid = 0;
+    return false;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.51.0
Re: [for 4.22 v5 02/18] xen/riscv: introduce VMID allocation and management
Posted by Jan Beulich 3 months ago
On 20.10.2025 17:57, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/vmid.c
> @@ -0,0 +1,193 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include <xen/domain.h>
> +#include <xen/init.h>
> +#include <xen/sections.h>
> +#include <xen/lib.h>
> +#include <xen/param.h>
> +#include <xen/percpu.h>
> +
> +#include <asm/atomic.h>
> +#include <asm/csr.h>
> +#include <asm/flushtlb.h>
> +#include <asm/p2m.h>
> +
> +/* Xen command-line option to enable VMIDs */
> +static bool __ro_after_init opt_vmid = true;
> +boolean_param("vmid", opt_vmid);

Command line options, btw, want documenting in docs/misc/xen-command-line.pandoc.
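For reference, such an entry could look like the following — a sketch only,
modeled on the style of existing boolean options in that file; the exact
wording is a suggestion, not part of the series:

```
### vmid (RISC-V)
> `= <boolean>`

> Default: `true`

Permit Xen to use VMIDs to tag guest TLB entries, reducing the number of
guest TLB flushes needed when switching guest-physical address spaces.
```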

> +/*
> + * VMIDs partition the physical TLB. In the current implementation VMIDs are
> + * introduced to reduce the number of TLB flushes. Each time a guest-physical
> + * address space changes, instead of flushing the TLB, a new VMID is
> + * assigned. This reduces the number of TLB flushes to at most 1/#VMIDs.
> + * The biggest advantage is that hot parts of the hypervisor's code and data
> + * retain in the TLB.
> + *
> + * Sketch of the Implementation:
> + *
> + * VMIDs are a hart-local resource.  As preemption of VMIDs is not possible,
> + * VMIDs are assigned in a round-robin scheme. To minimize the overhead of
> + * VMID invalidation, at the time of a TLB flush, VMIDs are tagged with a
> + * 64-bit generation. Only on a generation overflow the code needs to
> + * invalidate all VMID information stored at the VCPUs with are run on the
> + * specific physical processor. When this overflow appears VMID usage is
> + * disabled to retain correctness.
> + */
> +
> +/* Per-Hart VMID management. */
> +struct vmid_data {
> +   uint64_t generation;
> +   uint16_t next_vmid;
> +   uint16_t max_vmid;
> +   bool used;
> +};
> +
> +static DEFINE_PER_CPU(struct vmid_data, vmid_data);
> +
> +static unsigned int vmidlen_detect(void)
> +{
> +    unsigned int vmid_bits;
> +
> +    /*
> +     * According to the RISC-V Privileged Architecture Spec:
> +     *   When MODE=Bare, guest physical addresses are equal to supervisor
> +     *   physical addresses, and there is no further memory protection
> +     *   for a guest virtual machine beyond the physical memory protection
> +     *   scheme described in Section "Physical Memory Protection".
> +     *   In this case, the remaining fields in hgatp must be set to zeros.
> +     * Thereby it is necessary to set gstage_mode not equal to Bare.
> +     */
> +    ASSERT(gstage_mode != HGATP_MODE_OFF);
> +    csr_write(CSR_HGATP,
> +              MASK_INSR(gstage_mode, HGATP_MODE_MASK) | HGATP_VMID_MASK);
> +    vmid_bits = MASK_EXTR(csr_read(CSR_HGATP), HGATP_VMID_MASK);
> +    vmid_bits = flsl(vmid_bits);
> +    csr_write(CSR_HGATP, _AC(0, UL));
> +
> +    /*
> +     * From RISC-V spec:
> +     *   Speculative executions of the address-translation algorithm behave as
> +     *   non-speculative executions of the algorithm do, except that they must
> +     *   not set the dirty bit for a PTE, they must not trigger an exception,
> +     *   and they must not create address-translation cache entries if those
> +     *   entries would have been invalidated by any SFENCE.VMA instruction
> +     *   executed by the hart since the speculative execution of the algorithm
> +     *   began.
> +     *
> +     * Also, despite of the fact here it is mentioned that when V=0 two-stage
> +     * address translation is inactivated:
> +     *   The current virtualization mode, denoted V, indicates whether the hart
> +     *   is currently executing in a guest. When V=1, the hart is either in
> +     *   virtual S-mode (VS-mode), or in virtual U-mode (VU-mode) atop a guest
> +     *   OS running in VS-mode. When V=0, the hart is either in M-mode, in
> +     *   HS-mode, or in U-mode atop an OS running in HS-mode. The
> +     *   virtualization mode also indicates whether two-stage address
> +     *   translation is active (V=1) or inactive (V=0).
> +     * But on the same side, writing to hgatp register activates it:
> +     *   The hgatp register is considered active for the purposes of
> +     *   the address-translation algorithm unless the effective privilege mode
> +     *   is U and hstatus.HU=0.
> +     *
> +     * Thereby it leaves some room for speculation even in this stage of boot,
> +     * so it could be that we polluted local TLB so flush all guest TLB.
> +     */
> +    local_hfence_gvma_all();

That's a lot of redundancy with gstage_mode_detect(). The function call here
actually renders the one there redundant, afaict. Did you consider putting a
single instance at the end of it in pre_gstage_init()? Otherwise at least
don't repeat the comment here, but merely point at the other one?

> +    return vmid_bits;
> +}
> +
> +void vmid_init(void)

This (and its helper) isn't __init because you intend to also call it during
bringup of secondary processors?

> +{
> +    static int8_t g_vmid_used = -1;

Now that you're getting closer to the x86 original - __ro_after_init?

> +    unsigned int vmid_len = vmidlen_detect();
> +    struct vmid_data *data = &this_cpu(vmid_data);
> +
> +    BUILD_BUG_ON((HGATP_VMID_MASK >> HGATP_VMID_SHIFT) >
> +                 (BIT((sizeof(data->max_vmid) * BITS_PER_BYTE), UL) - 1));
> +
> +    data->max_vmid = BIT(vmid_len, U) - 1;
> +    data->used = opt_vmid && (vmid_len > 1);
> +
> +    if ( g_vmid_used < 0 )
> +    {
> +        g_vmid_used = data->used;
> +        printk("VMIDs use is %sabled\n", data->used ? "en" : "dis");
> +    }
> +    else if ( g_vmid_used != data->used )
> +        printk("CPU%u: VMIDs use is %sabled\n", smp_processor_id(),
> +               data->used ? "en" : "dis");
> +
> +    /* Zero indicates 'invalid generation', so we start the count at one. */
> +    data->generation = 1;
> +
> +    /* Zero indicates 'VMIDs use disabled', so we start the count at one. */
> +    data->next_vmid = 1;
> +}
> +
> +void vmid_flush_vcpu(struct vcpu *v)
> +{
> +    write_atomic(&v->arch.vmid.generation, 0);
> +}
> +
> +void vmid_flush_hart(void)
> +{
> +    struct vmid_data *data = &this_cpu(vmid_data);
> +
> +    if ( !data->used )
> +        return;
> +
> +    if ( likely(++data->generation != 0) )
> +        return;
> +
> +    /*
> +     * VMID generations are 64 bit.  Overflow of generations never happens.
> +     * For safety, we simply disable ASIDs, so correctness is established; it
> +     * only runs a bit slower.
> +     */
> +    printk("%s: VMID generation overrun. Disabling VMIDs.\n", __func__);

Is logging of the function name of any value here? Also, despite the x86
original having it like this - generally no full stops please in log
messages. "VMID generation overrun; disabling VMIDs\n" would do.

> +bool vmid_handle_vmenter(struct vcpu_vmid *vmid)
> +{
> +    struct vmid_data *data = &this_cpu(vmid_data);
> +
> +    /* Test if VCPU has valid VMID. */

x86 has a ->disabled check up from here; why do you not check ->used?

> +    if ( read_atomic(&vmid->generation) == data->generation )
> +        return 0;
> +
> +    /* If there are no free VMIDs, need to go to a new generation. */
> +    if ( unlikely(data->next_vmid > data->max_vmid) )
> +    {
> +        vmid_flush_hart();
> +        data->next_vmid = 1;
> +        if ( !data->used )
> +            goto disabled;
> +    }
> +
> +    /* Now guaranteed to be a free VMID. */
> +    vmid->vmid = data->next_vmid++;
> +    write_atomic(&vmid->generation, data->generation);
> +
> +    /*
> +     * When we assign VMID 1, flush all TLB entries as we are starting a new
> +     * generation, and all old VMID allocations are now stale.
> +     */
> +    return (vmid->vmid == 1);

Minor: Parentheses aren't really needed here.

Jan
Re: [for 4.22 v5 02/18] xen/riscv: introduce VMID allocation and management
Posted by Oleksii Kurochko 2 months, 3 weeks ago
On 11/6/25 3:05 PM, Jan Beulich wrote:
> On 20.10.2025 17:57, Oleksii Kurochko wrote:
>
> +static unsigned int vmidlen_detect(void)
> +{
> +    unsigned int vmid_bits;
> +
> +    /*
> +     * According to the RISC-V Privileged Architecture Spec:
> +     *   When MODE=Bare, guest physical addresses are equal to supervisor
> +     *   physical addresses, and there is no further memory protection
> +     *   for a guest virtual machine beyond the physical memory protection
> +     *   scheme described in Section "Physical Memory Protection".
> +     *   In this case, the remaining fields in hgatp must be set to zeros.
> +     * Thereby it is necessary to set gstage_mode not equal to Bare.
> +     */
> +    ASSERT(gstage_mode != HGATP_MODE_OFF);
> +    csr_write(CSR_HGATP,
> +              MASK_INSR(gstage_mode, HGATP_MODE_MASK) | HGATP_VMID_MASK);
> +    vmid_bits = MASK_EXTR(csr_read(CSR_HGATP), HGATP_VMID_MASK);
> +    vmid_bits = flsl(vmid_bits);
> +    csr_write(CSR_HGATP, _AC(0, UL));
> +
> +    /*
> +     * From RISC-V spec:
> +     *   Speculative executions of the address-translation algorithm behave as
> +     *   non-speculative executions of the algorithm do, except that they must
> +     *   not set the dirty bit for a PTE, they must not trigger an exception,
> +     *   and they must not create address-translation cache entries if those
> +     *   entries would have been invalidated by any SFENCE.VMA instruction
> +     *   executed by the hart since the speculative execution of the algorithm
> +     *   began.
> +     *
> +     * Also, despite of the fact here it is mentioned that when V=0 two-stage
> +     * address translation is inactivated:
> +     *   The current virtualization mode, denoted V, indicates whether the hart
> +     *   is currently executing in a guest. When V=1, the hart is either in
> +     *   virtual S-mode (VS-mode), or in virtual U-mode (VU-mode) atop a guest
> +     *   OS running in VS-mode. When V=0, the hart is either in M-mode, in
> +     *   HS-mode, or in U-mode atop an OS running in HS-mode. The
> +     *   virtualization mode also indicates whether two-stage address
> +     *   translation is active (V=1) or inactive (V=0).
> +     * But on the same side, writing to hgatp register activates it:
> +     *   The hgatp register is considered active for the purposes of
> +     *   the address-translation algorithm unless the effective privilege mode
> +     *   is U and hstatus.HU=0.
> +     *
> +     * Thereby it leaves some room for speculation even in this stage of boot,
> +     * so it could be that we polluted local TLB so flush all guest TLB.
> +     */
> +    local_hfence_gvma_all();
> That's a lot of redundancy with gstage_mode_detect(). The function call here
> actually renders the one there redundant, afaict. Did you consider putting a
> single instance at the end of it in pre_gstage_init()? Otherwise at least
> don't repeat the comment here, but merely point at the other one?

Agree, it could be moved to the end of pre_gstage_init().


>> +    return vmid_bits;
>> +}
>> +
>> +void vmid_init(void)
> This (and its helper) isn't __init because you intend to also call it during
> bringup of secondary processors?

Yes, I wasn't able to find that VMIDLEN is guaranteed to be the same for all
harts.

>> +    unsigned int vmid_len = vmidlen_detect();
>> +    struct vmid_data *data = &this_cpu(vmid_data);
>> +
>> +    BUILD_BUG_ON((HGATP_VMID_MASK >> HGATP_VMID_SHIFT) >
>> +                 (BIT((sizeof(data->max_vmid) * BITS_PER_BYTE), UL) - 1));
>> +
>> +    data->max_vmid = BIT(vmid_len, U) - 1;
>> +    data->used = opt_vmid && (vmid_len > 1);
>> +
>> +    if ( g_vmid_used < 0 )
>> +    {
>> +        g_vmid_used = data->used;
>> +        printk("VMIDs use is %sabled\n", data->used ? "en" : "dis");
>> +    }
>> +    else if ( g_vmid_used != data->used )
>> +        printk("CPU%u: VMIDs use is %sabled\n", smp_processor_id(),
>> +               data->used ? "en" : "dis");
>> +
>> +    /* Zero indicates 'invalid generation', so we start the count at one. */
>> +    data->generation = 1;
>> +
>> +    /* Zero indicates 'VMIDs use disabled', so we start the count at one. */
>> +    data->next_vmid = 1;
>> +}
>> +
>> +void vmid_flush_vcpu(struct vcpu *v)
>> +{
>> +    write_atomic(&v->arch.vmid.generation, 0);
>> +}
>> +
>> +void vmid_flush_hart(void)
>> +{
>> +    struct vmid_data *data = &this_cpu(vmid_data);
>> +
>> +    if ( !data->used )
>> +        return;
>> +
>> +    if ( likely(++data->generation != 0) )
>> +        return;
>> +
>> +    /*
>> +     * VMID generations are 64 bit.  Overflow of generations never happens.
>> +     * For safety, we simply disable ASIDs, so correctness is established; it
>> +     * only runs a bit slower.
>> +     */
>> +    printk("%s: VMID generation overrun. Disabling VMIDs.\n", __func__);
> Is logging of the function name of any value here?

Agreed, there is no sense in logging the function name.

>   Also, despite the x86
> original havinbg it like this - generally no full stops please if log
> messages. "VMID generation overrun; disabling VMIDs\n" would do.

Sure, I will drop it and will try not to add it in such cases. But could you
please remind me (if I have asked this before) what the reason is that a full
stop shouldn't be present in such messages?

>> +bool vmid_handle_vmenter(struct vcpu_vmid *vmid)
>> +{
>> +    struct vmid_data *data = &this_cpu(vmid_data);
>> +
>> +    /* Test if VCPU has valid VMID. */
> x86 has a ->disabled check up from here; why do you not check ->used?

The x86 comment confused me; at first I thought the check was related to
erratum #170, but now I see that it might actually be useful here, so I'll add:
     if ( !data->used )
         goto disabled;

Thanks.

~ Oleksii
Re: [for 4.22 v5 02/18] xen/riscv: introduce VMID allocation and management
Posted by Jan Beulich 2 months, 3 weeks ago
On 14.11.2025 10:27, Oleksii Kurochko wrote:
> On 11/6/25 3:05 PM, Jan Beulich wrote:
>> On 20.10.2025 17:57, Oleksii Kurochko wrote:
>>> +void vmid_flush_hart(void)
>>> +{
>>> +    struct vmid_data *data = &this_cpu(vmid_data);
>>> +
>>> +    if ( !data->used )
>>> +        return;
>>> +
>>> +    if ( likely(++data->generation != 0) )
>>> +        return;
>>> +
>>> +    /*
>>> +     * VMID generations are 64 bit.  Overflow of generations never happens.
>>> +     * For safety, we simply disable ASIDs, so correctness is established; it
>>> +     * only runs a bit slower.
>>> +     */
>>> +    printk("%s: VMID generation overrun. Disabling VMIDs.\n", __func__);
>> Is logging of the function name of any value here?
> 
> Agree, there is no any sense for the logging of the function name.
> 
>>   Also, despite the x86
>> original havinbg it like this - generally no full stops please if log
>> messages. "VMID generation overrun; disabling VMIDs\n" would do.
> 
> Sure, I will drop it and will try to not add it in such cases. But could you
> please remind (if I asked that before) me what is the reason why full stop
> shouldn't be presented in such cases?

First: Consistency across the code base. Second: Meaningless characters
needlessly consume serial line bandwidth.

Jan