[PATCH 0/6] target/i386/mshv: use hv_vp_register_page for fast register access

Doru Blânzeanu posted 6 patches 1 month ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20260428135053.251200-1-dblanzeanu@linux.microsoft.com
Maintainers: Magnus Kulke <magnuskulke@linux.microsoft.com>, Wei Liu <wei.liu@kernel.org>, Paolo Bonzini <pbonzini@redhat.com>, Zhao Liu <zhao1.liu@intel.com>
There is a newer version of this series
Posted by Doru Blânzeanu 1 month ago
This series adds support for using the hypervisor's VP register page
in the mshv accelerator to optimize vcpu register access on MMIO and PIO
exits.

Currently, all register reads and writes go through hypercalls (ioctls),
which adds overhead on every VM exit. The VP register page is a shared
memory page that the hypervisor populates with vcpu register state,
allowing QEMU to read and write registers directly without hypercalls.

The series is structured as follows:
1. Remove the duplicate `fetch_guest_state` function, consolidating
  register loading into `mshv_load_regs`.
2. Move `mshv_arch_init_vcpu` after vcpu creation so the vcpu fd is
  valid when we need it for mmap.
3. Define the `hv_vp_register_page` structure in `hvgdk_mini.h`, matching
  the layout used by the Linux kernel's mshv driver.
4. Set up the register page by mmapping the vcpu fd at init time. If the
  mmap fails, we fall back gracefully to the existing hypercall path.
5. Use the register page to read registers on VM exit. General purpose
  registers, RIP, RFLAGS, segment registers, and control registers
  (CR0, CR3, CR4, CR8, EFER) are read directly from the page. Registers
  not present on the page (TR, LDTR, GDTR, IDTR, CR2, APIC_BASE) are still
  fetched via hypercall.
6. Use the register page to write registers on VM entry. GP registers,
  RIP, and RFLAGS are written to the page with the appropriate dirty
  bits set, avoiding the hypercall for the standard register store.
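
As an illustration of step 6, here is a minimal sketch of a register store
through a shared page with dirty bits. The struct, the `DEMO_DIRTY_*` bit
values and `demo_store_regs` are hypothetical stand-ins, not the real
`hv_vp_register_page` layout or the code in this series; the point is only
that every class of register written to the page gets its dirty bit set so
the hypervisor knows to pick it up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty-bit assignments, for illustration only. */
enum {
    DEMO_DIRTY_GENERAL = 1u << 0, /* GP registers were written */
    DEMO_DIRTY_IP      = 1u << 1, /* RIP was written */
    DEMO_DIRTY_FLAGS   = 1u << 2, /* RFLAGS was written */
};

/* Simplified stand-in for the shared page (assumption, not the real ABI). */
struct demo_reg_page {
    uint8_t  isvalid;
    uint32_t dirty;
    uint64_t gp[16];
    uint64_t rip;
    uint64_t rflags;
};

struct demo_regs {
    uint64_t gp[16];
    uint64_t rip;
    uint64_t rflags;
};

/*
 * Store the standard registers through the page.
 * Returns 0 on success, -1 if the caller must use the hypercall path.
 */
int demo_store_regs(struct demo_reg_page *page, const struct demo_regs *regs)
{
    if (page == NULL || page->isvalid == 0) {
        return -1;
    }

    for (size_t i = 0; i < 16; i++) {
        page->gp[i] = regs->gp[i];
    }
    page->rip = regs->rip;
    page->rflags = regs->rflags;

    /* Tell the hypervisor which register classes changed. */
    page->dirty |= DEMO_DIRTY_GENERAL | DEMO_DIRTY_IP | DEMO_DIRTY_FLAGS;
    return 0;
}
```

A caller would try `demo_store_regs` first and issue the register-set
hypercall only when it returns -1.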

The register page is only used when it has been successfully mmapped and
the hypervisor has marked it as valid (`isvalid != 0`). Otherwise, the
existing hypercall-based path is used as a fallback.
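
The fallback rule can be sketched as follows. This is an illustration under
stated assumptions, not the code from the series: `sketch_reg_page`,
`sketch_map_reg_page`, `sketch_hypercall_get_rip` and the mmap offset of 0
are all hypothetical, and the hypercall is replaced by a stub that returns a
fixed value.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Simplified stand-in for the shared page (assumption, not the real ABI). */
struct sketch_reg_page {
    uint8_t  isvalid;
    uint64_t rip;
};

/*
 * Try to map one page from the vcpu fd. The offset (0 here) is an
 * assumption; the real driver defines its own offset for this page.
 * Returns NULL on failure so callers can fall back to hypercalls.
 */
struct sketch_reg_page *sketch_map_reg_page(int vcpu_fd)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, vcpu_fd, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Stub standing in for the hypercall-based register read. */
uint64_t sketch_hypercall_get_rip(void)
{
    return 0xfff0;
}

/* Use the page only if it is mapped and marked valid. */
uint64_t sketch_get_rip(const struct sketch_reg_page *page)
{
    if (page != NULL && page->isvalid != 0) {
        return page->rip;
    }
    return sketch_hypercall_get_rip();
}
```

Keeping the check in one accessor means a failed mmap and an invalid page
take the same slow path, and no caller needs to know which case occurred.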

Doru Blânzeanu (6):
  target/i386/mshv: remove duplicate function for reading vcpu registers
  accel/mshv: move vcpu arch specific initialization after vcpu creation
  include/hw/hyperv: add hv_vp_register_page struct definition
  target/i386/mshv: hv_vp_register_page setup for the vcpu
  target/i386/mshv: use the register page to get registers
  target/i386/mshv: use the register page to set registers

 accel/mshv/mshv-all.c          |   3 +-
 include/hw/hyperv/hvgdk_mini.h | 103 +++++++++++++++
 target/i386/cpu.h              |   3 +
 target/i386/mshv/mshv-cpu.c    | 221 ++++++++++++++++++++++++++++-----
 4 files changed, 296 insertions(+), 34 deletions(-)

-- 
2.53.0


Re: [PATCH 0/6] target/i386/mshv: use hv_vp_register_page for fast register access
Posted by Mohamed Mediouni 1 month ago

> On 28. Apr 2026, at 15:50, Doru Blânzeanu <dblanzeanu@linux.microsoft.com> wrote:
> 
> This series adds support for using the hypervisor's VP register page
> in the mshv accelerator to optimize vcpu register access on MMIO and PIO
> exits.
> 
> Currently, all register reads and writes go through hypercalls (ioctls),
> which adds overhead on every VM exit. The VP register page is a shared
> memory page that the hypervisor populates with vcpu register state,
> allowing QEMU to read and write registers directly without hypercalls.
> 
> The series is structured as follows:
> 1. Remove the duplicate `fetch_guest_state` function, consolidating
>  register loading into `mshv_load_regs`.
> 2. Move `mshv_arch_init_vcpu` after vcpu creation so the vcpu fd is
>  valid when we need it for mmap.
> 3. Define the `hv_vp_register_page` structure in `hvgdk_mini.h`, matching
>  the layout used by the Linux kernel's mshv driver.
> 4. Set up the register page by mmapping the vcpu fd at init time. If the
>  mmap fails, we fall back gracefully to the existing hypercall path.
> 5. Use the register page to read registers on VM exit. General purpose
>  registers, RIP, RFLAGS, segment registers, and control registers
>  (CR0, CR3, CR4, CR8, EFER) are read directly from the page. Registers
>  not present on the page (TR, LDTR, GDTR, IDTR, CR2, APIC_BASE) are still
>  fetched via hypercall.
> 6. Use the register page to write registers on VM entry. GP registers,
>  RIP, and RFLAGS are written to the page with the appropriate dirty
>  bits set, avoiding the hypercall for the standard register store.
> 
> The register page is only used when it has been successfully mmapped and
> the hypervisor has marked it as valid (`isvalid != 0`). Otherwise, the
> existing hypercall-based path is used as a fallback.

Hello,

Some overall comments after reading this series:

- Bifurcated state sync is inevitable once the full state is no longer
synced for most MMIO and I/O port accesses.

Skipping some state is desirable. LDTR/GDTR are among the painful
ones and you’ll very much want to avoid syncing those if you can.

In the typical WHP emulation case, no state is used outside
of the shared register page, and getting there made things much
smoother than before.

- But there’s an exception. The following pattern causes problems:

    /* Advance RIP and update RAX */
    rip = info->header.rip + insn_len;
    rax = info->rax;

    reg_names[0] = HV_X64_REGISTER_RIP;
    reg_values[0] = rip;
    reg_names[1] = HV_X64_REGISTER_RAX;
    reg_values[1] = rax;

    ret = set_x64_registers(cpu, reg_names, reg_values);
    if (ret < 0) {
        error_report("Failed to set x64 registers");
        return -1;
    }

    cpu->accel->dirty = false;

When put together with hw/i386/vmport.c, which calls cpu_synchronize_state
on an I/O port read (thankfully not very frequent), you’ll get issues with
this, as vmport both reads and sets register values on its own.

In WHPX, this was dealt with in: https://patchew.org/QEMU/20260422214225.2242-1-mohamed@unpredictable.fr/20260422214225.2242-12-mohamed@unpredictable.fr/
and you’ll probably want to do something similar.

I imagine that `cpu->accel->dirty = false;` was probably an attempt to
get things to boot when faced with that.

This adds a small constraint: if additional state is fetched later, then
writes have to happen either before that fetch, or on the now-synced
state instead of the partial view.
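
To make the constraint concrete, here is a toy model (not the WHPX fix and
not code from this series; all `toy_*` names and fields are hypothetical).
If a device has already pulled and modified the full state, the exit handler
applies RIP/RAX to that synced copy and leaves it dirty, rather than writing
the partial view and clearing the flag:

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_vcpu {
    uint64_t hv_rip, hv_rax, hv_rbx; /* state held by the hypervisor */
    uint64_t rip, rax, rbx;          /* QEMU-side synced copy */
    bool dirty;                      /* synced copy must be written back */
};

/* Stand-in for cpu_synchronize_state(): pull the full state once. */
void toy_sync_state(struct toy_vcpu *cpu)
{
    if (!cpu->dirty) {
        cpu->rip = cpu->hv_rip;
        cpu->rax = cpu->hv_rax;
        cpu->rbx = cpu->hv_rbx;
        cpu->dirty = true;
    }
}

/* Write the synced copy back before re-entering the guest. */
void toy_flush_state(struct toy_vcpu *cpu)
{
    if (cpu->dirty) {
        cpu->hv_rip = cpu->rip;
        cpu->hv_rax = cpu->rax;
        cpu->hv_rbx = cpu->rbx;
        cpu->dirty = false;
    }
}

/* Complete an I/O port read: advance RIP and set RAX. */
void toy_complete_pio(struct toy_vcpu *cpu, uint64_t next_rip, uint64_t rax)
{
    if (cpu->dirty) {
        /* Full state was synced: update it and keep it dirty. */
        cpu->rip = next_rip;
        cpu->rax = rax;
        return;
    }
    /* Fast path: nobody synced, write only the partial view. */
    cpu->hv_rip = next_rip;
    cpu->hv_rax = rax;
}
```

With this ordering, a register that a device such as vmport changed in the
synced copy (RBX in the model) still reaches the hypervisor on the next
flush, because the handler never clears `dirty` behind the device's back.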