> On 28. Apr 2026, at 15:50, Doru Blânzeanu <dblanzeanu@linux.microsoft.com> wrote:
>
> This series adds support for using the hypervisor's vp register page
> in the mshv accelerator to optimize vcpu register access on mmio and pio
> exits.
>
> Currently, all register reads and writes go through hypercalls (ioctls),
> which adds overhead on every VM exit. The VP register page is a shared
> memory page that the hypervisor populates with vcpu register state,
> allowing QEMU to read and write registers directly without hypercalls.
>
> The series is structured as follows:
> 1. Remove the duplicate `fetch_guest_state` function, consolidating
> register loading into `mshv_load_regs`.
> 2. Move `mshv_arch_init_vcpu` after vcpu creation so the vcpu fd is
> valid when we need it for mmap.
> 3. Define the `hv_vp_register_page` structure in `hvgdk_mini.h`, matching
> the layout used by the Linux kernel's mshv driver.
> 4. Set up the register page by mmapping the vcpu fd at init time. If the
> mmap fails, we fall back gracefully to the existing hypercall path.
> 5. Use the register page to read registers on VM exit. General purpose
> registers, RIP, RFLAGS, segment registers, and control registers
> (CR0, CR3, CR4, CR8, EFER) are read directly from the page. Registers
> not present on the page (TR, LDTR, GDTR, IDTR, CR2, APIC_BASE) are still
> fetched via hypercall.
> 6. Use register page to write registers on vmentry. GP registers,
> RIP, and RFLAGS are written to the page with the appropriate dirty
> bits set, avoiding the hypercall for the standard register store.
>
> The register page is only used when it has been successfully mmapped and
> the hypervisor has marked it as valid (`isvalid != 0`). Otherwise, the
> existing hypercall-based path is used as a fallback.
Hello,
Some overall comments after reading this series:
- Bifurcated state sync is inevitable once the full state is no longer
synced for most MMIO and I/O port accesses.
Skipping some state is desirable. LDTR/GDTR are among the painful
ones, and you’ll very much want to avoid syncing those if you can.
In the typical WHP emulation case, no state outside of the shared
register page is used, and getting there made things much smoother
than before.
- But there’s an exception. The following pattern causes problems:
    /* Advance RIP and update RAX */
    rip = info->header.rip + insn_len;
    rax = info->rax;
    reg_names[0] = HV_X64_REGISTER_RIP;
    reg_values[0] = rip;
    reg_names[1] = HV_X64_REGISTER_RAX;
    reg_values[1] = rax;
    ret = set_x64_registers(cpu, reg_names, reg_values);
    if (ret < 0) {
        error_report("Failed to set x64 registers");
        return -1;
    }
    cpu->accel->dirty = false;
This breaks when combined with hw/i386/vmport.c, which calls
cpu_synchronize_state() on an I/O port read (thankfully not very frequent),
because vmport both reads and sets register values on its own.
In WHPX, this was dealt with in: https://patchew.org/QEMU/20260422214225.2242-1-mohamed@unpredictable.fr/20260422214225.2242-12-mohamed@unpredictable.fr/
and you’ll probably want to do something similar.
I imagine that “cpu->accel->dirty = false;” was probably an attempt to
get things to boot when faced with that.
This adds a small constraint: if additional state is fetched in the
future, writes have to happen either before that fetch or on the
now-synced state rather than on the partial view.
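
To make the constraint concrete, here is a small standalone toy model of the
hazard. It is not QEMU code and not necessarily what the WHPX patch does:
every name in it (ToyVcpu, toy_synchronize_state, toy_device_io_read, and so
on) is hypothetical, and the register set is cut down to three registers.
The "direct" handler mirrors the pattern quoted above; the "merged" handler
writes to the synced copy when one exists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all types and functions here are hypothetical. */
typedef struct {
    uint64_t rip, rax, rbx;
} ToyRegs;

typedef struct {
    ToyRegs hv;     /* state as held by the hypervisor */
    ToyRegs cache;  /* QEMU-side copy, valid only when synced */
    bool synced;    /* cache holds the full state fetched from hv */
    bool dirty;     /* cache must be written back before vmentry */
} ToyVcpu;

/* Stand-in for cpu_synchronize_state(): fetch the full state once. */
static void toy_synchronize_state(ToyVcpu *v)
{
    if (!v->synced) {
        v->cache = v->hv;
        v->synced = true;
        v->dirty = true;
    }
}

/* Stand-in for a vmport-like device: syncs, then sets a register. */
static void toy_device_io_read(ToyVcpu *v)
{
    toy_synchronize_state(v);
    v->cache.rbx = 0x1234;  /* arbitrary value written by the device */
}

/* Problematic pattern: partial write, then drop the dirty flag. */
static void toy_finish_pio_direct(ToyVcpu *v, uint64_t insn_len, uint64_t rax)
{
    v->hv.rip += insn_len;
    v->hv.rax = rax;
    v->dirty = false;   /* discards whatever the device put in the cache */
    v->synced = false;
}

/* Alternative: write to the synced state if a full fetch has happened. */
static void toy_finish_pio_merged(ToyVcpu *v, uint64_t insn_len, uint64_t rax)
{
    ToyRegs *dst = v->synced ? &v->cache : &v->hv;

    dst->rip += insn_len;
    dst->rax = rax;
}

/* Write the cache back if it is dirty, then invalidate it. */
static void toy_vmentry(ToyVcpu *v)
{
    if (v->dirty) {
        v->hv = v->cache;
    }
    v->dirty = false;
    v->synced = false;
}

/* Run one I/O port read exit and return the RBX the guest would see. */
static uint64_t toy_run(bool merged)
{
    ToyVcpu v = { .hv = { .rip = 0x1000, .rax = 0, .rbx = 0 } };

    toy_device_io_read(&v);
    if (merged) {
        toy_finish_pio_merged(&v, 1, 0xab);
    } else {
        toy_finish_pio_direct(&v, 1, 0xab);
    }
    toy_vmentry(&v);
    return v.hv.rbx;
}
```

In this model, toy_run(false) loses the device's RBX update because the dirty
flag is cleared before write-back, while toy_run(true) carries RIP, RAX and
RBX through a single write-back at vmentry.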