[PATCH 00/10] Support QEMU cpu models in MSHV accelerator

Magnus Kulke posted 10 patches 1 month, 4 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20260211155410.203883-1-magnuskulke@linux.microsoft.com
Maintainers: Magnus Kulke <magnuskulke@linux.microsoft.com>, Wei Liu <wei.liu@kernel.org>, Paolo Bonzini <pbonzini@redhat.com>, Zhao Liu <zhao1.liu@intel.com>
There is a newer version of this series
[PATCH 00/10] Support QEMU cpu models in MSHV accelerator
Posted by Magnus Kulke 1 month, 4 weeks ago
Hey all,

In the current MSHV accelerator code, passing CPU features via the -cpu
flag doesn't work as intended yet. When using the MSHV hypervisor, we
either silently discard the specified model/features and leave it up
to the hypervisor to provide a sensible set of features or, if the user
selects -cpu host, the hypervisor might refuse to create a partition.

This changeset introduces more comprehensive support for passing
desired guest CPU features to the hypervisor. It's also a prerequisite
for live migration support, in which we have to round-trip CPU state
explicitly.

Known issues:

We will probably have to iterate a bit more on this, since support for
CET_U/CET_S xstate bits was recently introduced in QEMU, which doesn't
fit well with our current approach of configuring the hypervisor with
static responses to cpuid queries.
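
To make the CET issue concrete, here is a minimal sketch of stripping
the CET bits from a static cpuid response. The bit positions are the
architectural ones from the Intel SDM; the CpuidEntry struct and the
filter_cet_bits() helper are hypothetical names for illustration and
are not taken from the series.

```c
#include <stdint.h>

/* Architectural bit positions (Intel SDM). */
#define CPUID_7_0_ECX_CET_SHSTK (1u << 7)   /* CPUID.(EAX=7,ECX=0):ECX[7]  */
#define CPUID_7_0_EDX_CET_IBT   (1u << 20)  /* CPUID.(EAX=7,ECX=0):EDX[20] */
#define XSS_CET_U               (1u << 11)  /* user-mode CET xstate component */
#define XSS_CET_S               (1u << 12)  /* supervisor CET xstate component */

/* Hypothetical stand-in for one static cpuid response. */
typedef struct CpuidEntry {
    uint32_t function, index;
    uint32_t eax, ebx, ecx, edx;
} CpuidEntry;

/* Clear the CET feature and xstate bits in the leaves that enumerate them. */
static void filter_cet_bits(CpuidEntry *e)
{
    if (e->function == 0x7 && e->index == 0) {
        e->ecx &= ~CPUID_7_0_ECX_CET_SHSTK;
        e->edx &= ~CPUID_7_0_EDX_CET_IBT;
    } else if (e->function == 0xd && e->index == 1) {
        /* ECX enumerates the low 32 bits of the supported IA32_XSS mask. */
        e->ecx &= ~(XSS_CET_U | XSS_CET_S);
    }
}
```

Entries for other leaves pass through unchanged.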

Drive-by fixes:

- tiny fix in MAINTAINERS
- adding packed attribute to inlined UAPI structs

best,

magnus

Magnus Kulke (10):
  MAINTAINERS: fix magnuskulke email-address
  include/hw/hyperv: add QEMU_PACKED to uapi structs
  accel/mshv: use mshv_create_partition_v2 payload
  target/i386/mshv: fix cpuid propagation bug
  target/i386/mshv: fix various cpuid traversal bugs
  target/i386/mshv: change cpuid mask to UINT32_MAX
  target/i386/mshv: set cpu model name on -cpu host
  target/i386: query mshv accel for supported cpuids
  target/i386/mshv: populate xsave area offsets
  target/i386/mshv: filter out CET bits in cpuid

 MAINTAINERS                    |   4 +-
 accel/mshv/mshv-all.c          |  35 ++++--
 include/hw/hyperv/hvgdk_mini.h |  39 ++++--
 include/hw/hyperv/hvhdk.h      | 199 ++++++++++++++++++++++++++++-
 include/system/mshv.h          |   3 +
 target/i386/cpu.c              |   8 ++
 target/i386/mshv/mshv-cpu.c    | 221 ++++++++++++++++++++++++++-------
 7 files changed, 438 insertions(+), 71 deletions(-)

-- 
2.34.1
Re: [PATCH 00/10] Support QEMU cpu models in MSHV accelerator
Posted by Paolo Bonzini 1 month, 1 week ago
On 2/11/26 16:54, Magnus Kulke wrote:
> Hey all,
> 
> In the current MSHV accelerator code passing CPU features via the -cpu
> flag doesn't work as intended yet. When using the MSHV hypervisor we
> either silently discard the specified model/features and leave it up
> to the hypervisor to provide a sensible set of features or if the user
> selects -cpu host, the hypervisor might refuse to create a partition.
> 
> This changeset introduces a more comprehensive support for passing
> desired guest cpu features to the hypervisor. It's also a prerequisite
> for Live Migration support, in which we have to roundtrip CPU State
> explicitly
> 
> Known issues:
> 
> We will probably have to iterate a bit more on this, since recently
> support for CET_U/CET_S xstate bits has been introduced in QEMU, which
> doesn't harmonize with our current approach of configuring the
> hypervisor with static responses to cpuid queries.

Hi Magnus,

I went back to reviewing this series.

More or less, using "-cpu host" should work because MSHV already runs in 
a partition.  Therefore, it should be safe to assume that whatever bits 
were allowed in the current partition's CPUID will also be allowed in 
the nested guest.

However, you still need to mask the features corresponding to MSRs that 
you do not save/restore; TSC deadline timer, AMX (XFD), FRED, PMU and 
UMWAIT are the first few that came to mind.  Alternatively, just add 
them to get/put_msrs.
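
A rough sketch of what such masking could look like, as a table of
CPUID bits keyed by leaf, subleaf and register. The bit positions are
the architectural ones; the FeatureMask table and mask_unmigrated()
are hypothetical names, not existing QEMU code, and the PMU entry
only clears EAX of leaf 0xA (version 0), which is an assumption about
what is sufficient.

```c
#include <stddef.h>
#include <stdint.h>

enum { R_EAX, R_EBX, R_ECX, R_EDX };

/* One group of CPUID bits to hide because the backing MSRs are not migrated. */
typedef struct FeatureMask {
    uint32_t leaf, subleaf;
    int reg;
    uint32_t bits;
    const char *why;
} FeatureMask;

static const FeatureMask unmigrated[] = {
    { 0x1, 0, R_ECX, 1u << 24, "TSC deadline timer" },
    { 0x7, 0, R_ECX, 1u << 5,  "WAITPKG (UMWAIT)" },
    { 0x7, 0, R_EDX, (1u << 22) | (1u << 24) | (1u << 25), "AMX" },
    { 0xd, 1, R_EAX, 1u << 4,  "XFD" },
    { 0x7, 1, R_EAX, 1u << 17, "FRED" },
    { 0xa, 0, R_EAX, 0xffffffffu, "architectural PMU" },
};

/* Return 'value' with every bit listed for (leaf, subleaf, reg) cleared. */
static uint32_t mask_unmigrated(uint32_t leaf, uint32_t subleaf, int reg,
                                uint32_t value)
{
    for (size_t i = 0; i < sizeof(unmigrated) / sizeof(unmigrated[0]); i++) {
        const FeatureMask *m = &unmigrated[i];
        if (m->leaf == leaf && m->subleaf == subleaf && m->reg == reg) {
            value &= ~m->bits;
        }
    }
    return value;
}
```

Once an MSR is added to get/put_msrs, its row can simply be dropped
from the table.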

Later on, we probably want to share some of the code to handle MSRs 
between Hyper-V and KVM.  Please add some comments about the hypercalls, 
since they are poorly documented, explaining how to find out which MSRs 
are supported by the hypervisor.

With respect to live migration, here are a few bits of state that 
should be migrated:

- the FPU registers (MshvFPU is dead code and can be removed; I missed 
it during my initial review).

- the PDPTRs.  That is probably okay (because you never set nor read 
env->pdptrs_valid) but please check if Hyper-V supports reading them.

- KVM also has support for migrating in the middle of an exception being 
delivered (for example if an EPT violation happens due to a write to the 
stack); this is handled with fields such as these:

         VMSTATE_UINT8(env.exception_pending, X86CPU),
         VMSTATE_UINT8(env.exception_injected, X86CPU),
         VMSTATE_UINT8(env.exception_has_payload, X86CPU),
         VMSTATE_UINT64(env.exception_payload, X86CPU),
         VMSTATE_INT32(env.exception_nr, X86CPU),
         VMSTATE_INT32(env.interrupt_injected, X86CPU),
         VMSTATE_UINT8(env.soft_interrupt, X86CPU),
         VMSTATE_UINT8(env.nmi_injected, X86CPU),
         VMSTATE_UINT8(env.nmi_pending, X86CPU),

please check how Hyper-V handles this situation so that it can be 
implemented in QEMU as well.

Thanks,

Paolo
Re: [PATCH 00/10] Support QEMU cpu models in MSHV accelerator
Posted by Magnus Kulke 1 month, 1 week ago
On Mon, Mar 02, 2026 at 07:34:21PM +0100, Paolo Bonzini wrote:
> Hi Magnus,
> 
> I went back to reviewing this series.
> 
> More or less, using "-cpu host" should work because MSHV already runs in a
> partition.  Therefore, it should be safe to assume that whatever bits were
> allowed in the current partition's CPUID will also be allowed in the nested
> guest.
> 
> However, you still need to mask the features corresponding to MSRs that you
> do not save/restore; this includes for example TSC deadline timer, AMX
> (XFD), FRED, PMU, UMWAIT are the first few that came to mind.  Or
> alternatively, just add them to get/put_msrs.
> 

Hey Paolo,

thanks for taking a look at this. I am currently staging follow-up
patch sets in which the MSR handling is reworked as part of the live
migration support. Those will cover more MSRs in a migration, using
a hardcoded list in rust-vmm/mshv as a base, with some additional ones
added:

https://docs.rs/crate/mshv-ioctls/0.6.7/source/src/ioctls/system.rs#363

In the future I think we want to have an MSHV_GET_SUPPORTED_MSRS ioctl
that we can query, similar to what's available for KVM.

There are "hv_partition_processor_features" that we query from the
hypervisor to filter out MSRs that are not available for a given
partition, e.g.

		uint64_t tsc_adjust_support:1;
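
As an illustration of that filtering, a minimal sketch follows. Only
the tsc_adjust_support bit and the architectural MSR indices are
real; the ProcFeatures struct is a hypothetical one-bit view (the
actual hv_partition_processor_features layout has many more bits,
not reproduced here) and filter_msrs() is a made-up helper.

```c
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_TSC        0x10
#define MSR_IA32_TSC_ADJUST 0x3b

/* Hypothetical cut-down view of the partition processor features. */
typedef struct ProcFeatures {
    uint64_t tsc_adjust_support:1;
} ProcFeatures;

/*
 * Copy the MSR indices from 'in' to 'out', skipping those the partition
 * does not support. Returns the number of entries written to 'out'.
 */
static size_t filter_msrs(const ProcFeatures *f, const uint32_t *in,
                          size_t n, uint32_t *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] == MSR_IA32_TSC_ADJUST && !f->tsc_adjust_support) {
            continue;
        }
        out[kept++] = in[i];
    }
    return kept;
}
```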

> Later on, we probably want to share some of the code to handle MSRs between
> Hyper-V and KVM.  Please add some comments about the hypercalls, since they
> are poorly documented, explaining how to find out which MSRs are supported
> by the hypervisor.
> 

If the hypercalls are not documented, we probably want to fix that in
the rust-vmm/mshv crate (which at the moment provides the authoritative
headers, until we have moved everything to the mshv uapi headers), but
I'll double check.

> With respect to live migration, here are a few bits of states that should be
> migrated:
> 
> - the FPU registers (MshvFPU is dead code and can be removed; I missed it
> during my initial review).

Yup, the FPU register handling has also been reworked a bit to
accommodate XSAVE migration.

> 
> - the PDPTRs.  That is probably okay (because you never set nor read
> env->pdptrs_valid) but please check if Hyper-V supports reading them.
> 

I'll try to find that out. I haven't stumbled over this so far, but I
understand it's relevant for 32-bit PAE guests, which we probably haven't
tested thoroughly yet.

> - KVM also has support for migrating in the middle of an exception being
> delivered (for example if an EPT violation happens due to a write to the
> stack); this is handled with fields such as these:
> 
>         VMSTATE_UINT8(env.exception_pending, X86CPU),
>         VMSTATE_UINT8(env.exception_injected, X86CPU),
>         VMSTATE_UINT8(env.exception_has_payload, X86CPU),
>         VMSTATE_UINT64(env.exception_payload, X86CPU),
>         VMSTATE_INT32(env.exception_nr, X86CPU),
>         VMSTATE_INT32(env.interrupt_injected, X86CPU),
>         VMSTATE_UINT8(env.soft_interrupt, X86CPU),
>         VMSTATE_UINT8(env.nmi_injected, X86CPU),
>         VMSTATE_UINT8(env.nmi_pending, X86CPU),
> 
> please check how Hyper-V handles this situation so that it can be
> implemented in QEMU as well.
> 

Those are covered in MSHV's "vCPU Events"; I think they map quite
cleanly to the QEMU representation:

https://docs.rs/mshv-bindings/0.6.7/src/mshv_bindings/x86_64/regs.rs.html#404
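
A sketch of what such a mapping could look like. Both structs are
hypothetical: VcpuEventsSketch only assumes an exception/interrupt/NMI
grouping and does not reproduce the actual mshv-bindings layout, and
CpuEnvSketch carries just the env field names from the VMSTATE list
quoted above. Using -1 for "nothing queued" is an assumption modeled
on how QEMU's KVM code treats these fields.

```c
#include <stdint.h>

/* Hypothetical event state as reported by the hypervisor. */
typedef struct VcpuEventsSketch {
    struct { uint8_t injected, pending, nr; } exception;
    struct { uint8_t injected, nr, soft; } interrupt;
    struct { uint8_t injected, pending; } nmi;
} VcpuEventsSketch;

/* Subset of the migrated QEMU state, named after the env fields. */
typedef struct CpuEnvSketch {
    uint8_t exception_pending, exception_injected;
    int32_t exception_nr, interrupt_injected;
    uint8_t soft_interrupt, nmi_injected, nmi_pending;
} CpuEnvSketch;

/* Translate hypervisor event state into the QEMU-side representation. */
static void events_to_env(const VcpuEventsSketch *ev, CpuEnvSketch *env)
{
    env->exception_pending = ev->exception.pending;
    env->exception_injected = ev->exception.injected;
    /* -1 means no exception or interrupt is in flight. */
    env->exception_nr = (ev->exception.pending || ev->exception.injected)
                        ? ev->exception.nr : -1;
    env->interrupt_injected = ev->interrupt.injected ? ev->interrupt.nr : -1;
    env->soft_interrupt = ev->interrupt.soft;
    env->nmi_injected = ev->nmi.injected;
    env->nmi_pending = ev->nmi.pending;
}
```

The reverse direction (env to events) would be needed on the
destination side of a migration and follows the same pattern.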

thanks,

magnus
Re: [PATCH 00/10] Support QEMU cpu models in MSHV accelerator
Posted by Paolo Bonzini 1 month, 1 week ago
On Tue, Mar 3, 2026 at 2:31 PM Magnus Kulke
<magnuskulke@linux.microsoft.com> wrote:
>
> On Mon, Mar 02, 2026 at 07:34:21PM +0100, Paolo Bonzini wrote:
> > Hi Magnus,
> >
> > I went back to reviewing this series.
> >
> > More or less, using "-cpu host" should work because MSHV already runs in a
> > partition.  Therefore, it should be safe to assume that whatever bits were
> > allowed in the current partition's CPUID will also be allowed in the nested
> > guest.
> >
> > However, you still need to mask the features corresponding to MSRs that you
> > do not save/restore; this includes for example TSC deadline timer, AMX
> > (XFD), FRED, PMU, UMWAIT are the first few that came to mind.  Or
> > alternatively, just add them to get/put_msrs.
> >
>
> Hey Paolo,
>
> thanks for taking a look at this. I am currently staging follow up
> patch-sets in which the MSR handling is reworked as part of the live
> migration support.

Ok, let's start from there so that we don't have to think about
filtering out features because they're not migrated.

Thanks for the extra information, I'll study the rust-vmm docs.

> in the future I think we want to have a MSHV_GET_SUPPORTED_MSRS ioctl
> that we can query, similar to what's available for KVM.

That may not be needed if we can rely on the parent partition's CPUID
(I changed my mind and I think we can, because QEMU already filters
out CPUID features it doesn't know about).

We may want to share the knowledge of MSRs between KVM and MSHV, but
that's a different story.

Paolo
