In current HVM mode, when a hypercall references a structure in guest
memory, it is passed to the hypervisor as its "linear address" (e.g. the
virtual address in x86 long mode). One of the caveats is that this linear
address (GVA) is generally not directly usable by Xen and needs to be
translated from GVA to GPA and then to HPA. This implies a complex and
potentially expensive walk of the guest's page tables. The cost can be
significant, especially if the P2M cannot make efficient use of superpages
(or with e.g. XSA-304).

This proposal introduces a new mode where all addresses used for hypercalls
are guest physical addresses (GPAs) instead of guest virtual addresses
(GVAs). Looking up the HPA corresponding to such a GPA then only requires
a P2M lookup, with no walk of the guest's own page tables.

This mode is opt-in and must be enabled explicitly by the toolstack. It is
also mandatory for confidential-computing guests (e.g. SEV), where the
guest page tables are not visible to the hypervisor.

In a synthetic xtf-based hypercall benchmark (a VCPUOP_get_runstate_info
loop), it gives roughly a 30% overhead reduction when tested on an AMD
EPYC 9124.

This series only implements support for x86, but the ABI is meaningful for
other architectures as well. A separate patch adding support to Linux is
planned.

Teddy Astie (4):
  xen: Introduce physaddr_abi CDF flag
  x86/hvm: Consider physaddr_abi when copying from/to guest memory
  x86/public: Expose physaddr_abi through Xen HVM CPUID leaf
  libxl: Add support for enabling physaddr_abi

 tools/include/libxl.h               |  5 +++++
 tools/libs/light/libxl_create.c     |  4 ++++
 tools/libs/light/libxl_types.idl    |  1 +
 tools/xl/xl_parse.c                 |  1 +
 xen/arch/x86/cpuid.c                |  3 +++
 xen/arch/x86/hvm/hvm.c              | 17 ++++++++++++++---
 xen/common/domain.c                 | 10 +++++++++-
 xen/include/public/arch-x86/cpuid.h |  2 ++
 xen/include/public/domctl.h         |  4 +++-
 xen/include/xen/sched.h             |  6 ++++++
 10 files changed, 48 insertions(+), 5 deletions(-)

-- 
2.47.2


Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
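For illustration only, a minimal sketch of what the hypervisor-side copy
path could look like under such a mode. The hvm_copy_from_guest_phys() and
hvm_copy_from_guest_linear() helpers exist today in xen/arch/x86/hvm/hvm.c;
the d->physaddr_abi field and the wiring below are assumptions based on
this cover letter, not the actual patch:

/*
 * Minimal sketch, not the actual patch: assume a per-domain flag
 * d->physaddr_abi, set from the proposed CDF flag at domain creation.
 */
static enum hvm_translation_result copy_hypercall_arg(
    void *buf, uint64_t addr, unsigned int size)
{
    struct domain *d = current->domain;

    if ( d->physaddr_abi )
        /* addr is a GPA: resolving it is a single P2M lookup. */
        return hvm_copy_from_guest_phys(buf, addr, size);

    /*
     * addr is a linear address: Xen first has to walk the guest's own
     * page tables (GVA -> GPA) before the P2M lookup (GPA -> HPA).
     */
    return hvm_copy_from_guest_linear(buf, addr, size, 0, NULL);
}

The point is simply that the GPA path touches only the P2M, while the
linear-address path first has to walk the guest's own page tables.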
On 18.04.2025 16:18, Teddy Astie wrote:
> In current HVM mode, when a hypercall references a structure in guest
> memory, it is passed to the hypervisor as its "linear address" (e.g. the
> virtual address in x86 long mode). One of the caveats is that this linear
> address (GVA) is generally not directly usable by Xen and needs to be
> translated from GVA to GPA and then to HPA. This implies a complex and
> potentially expensive walk of the guest's page tables. The cost can be
> significant, especially if the P2M cannot make efficient use of
> superpages (or with e.g. XSA-304).
>
> This proposal introduces a new mode where all addresses used for
> hypercalls are guest physical addresses (GPAs) instead of guest virtual
> addresses (GVAs). Looking up the HPA corresponding to such a GPA then
> only requires a P2M lookup, with no walk of the guest's own page tables.
>
> This mode is opt-in and must be enabled explicitly by the toolstack.

Which I view as a severe downside (leaving aside the PVH Dom0 aspect): this
way a guest needs to be converted all in one go. While doable, it'll be
increasingly risky with the size of the guest kernel code base.

A prior proposal of mine was to add an indicator to hypercall numbers (e.g.
to set the top bit there), to indicate which of the two models a particular
hypercall invocation uses. Aiui Andrew had yet different (albeit also never
spelled out) plans.

Jan
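For the sake of illustration, a per-invocation indicator of this kind could
look as follows from the guest side: only the converted call site needs to
know about the new model. __HYPERVISOR_vcpu_op and VCPUOP_get_runstate_info
are the real public-interface names; the indicator bit position and the
hypercall3() wrapper are invented here:

/*
 * Hypothetical guest-side view of a per-invocation indicator: the caller
 * passes the guest-physical address of its argument buffer and tags the
 * hypercall number.  HYPERCALL_GPA_MODE and hypercall3() are made up.
 */
#define HYPERCALL_GPA_MODE  (1UL << 63)  /* assumed indicator bit */

static long get_runstate_info_by_gpa(unsigned int vcpu, uint64_t info_gpa)
{
    /*
     * info_gpa is the guest-physical address of a struct vcpu_runstate_info
     * buffer; the hypervisor can resolve it with a single P2M lookup, and
     * the rest of the kernel can keep using linear addresses until it is
     * converted too.
     */
    return hypercall3(__HYPERVISOR_vcpu_op | HYPERCALL_GPA_MODE,
                      VCPUOP_get_runstate_info, vcpu, info_gpa);
}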
Hi,

On 22/04/2025 08:45, Jan Beulich wrote:
> On 18.04.2025 16:18, Teddy Astie wrote:
>> In current HVM mode, when a hypercall references a structure in guest
>> memory, it is passed to the hypervisor as its "linear address" (e.g. the
>> virtual address in x86 long mode). One of the caveats is that this
>> linear address (GVA) is generally not directly usable by Xen and needs
>> to be translated from GVA to GPA and then to HPA. This implies a complex
>> and potentially expensive walk of the guest's page tables. The cost can
>> be significant, especially if the P2M cannot make efficient use of
>> superpages (or with e.g. XSA-304).
>>
>> This proposal introduces a new mode where all addresses used for
>> hypercalls are guest physical addresses (GPAs) instead of guest virtual
>> addresses (GVAs). Looking up the HPA corresponding to such a GPA then
>> only requires a P2M lookup, with no walk of the guest's own page tables.
>>
>> This mode is opt-in and must be enabled explicitly by the toolstack.
>
> Which I view as a severe downside (leaving aside the PVH Dom0 aspect):
> this way a guest needs to be converted all in one go. While doable, it'll
> be increasingly risky with the size of the guest kernel code base.

+1. It is not only the guest kernel, but also the firmware (UEFI, U-Boot).

Furthermore, I don't think this can be easily adopted in public clouds,
where the admin for Xen and the guest will be different. So any indication
of the ABI would have to come from the guest itself rather than from the
configuration.

Cheers,

-- 
Julien Grall
On 01/05/2025 13:14, Julien Grall wrote:
> Hi,
>
> On 22/04/2025 08:45, Jan Beulich wrote:
>> On 18.04.2025 16:18, Teddy Astie wrote:
>>> In current HVM mode, when a hypercall references a structure in guest
>>> memory, it is passed to the hypervisor as its "linear address" (e.g.
>>> the virtual address in x86 long mode). One of the caveats is that this
>>> linear address (GVA) is generally not directly usable by Xen and needs
>>> to be translated from GVA to GPA and then to HPA. This implies a
>>> complex and potentially expensive walk of the guest's page tables. The
>>> cost can be significant, especially if the P2M cannot make efficient
>>> use of superpages (or with e.g. XSA-304).
>>>
>>> This proposal introduces a new mode where all addresses used for
>>> hypercalls are guest physical addresses (GPAs) instead of guest virtual
>>> addresses (GVAs). Looking up the HPA corresponding to such a GPA then
>>> only requires a P2M lookup, with no walk of the guest's own page
>>> tables.
>>>
>>> This mode is opt-in and must be enabled explicitly by the toolstack.
>>
>> Which I view as a severe downside (leaving aside the PVH Dom0 aspect):
>> this way a guest needs to be converted all in one go. While doable,
>> it'll be increasingly risky with the size of the guest kernel code base.
>
> +1. It is not only the guest kernel, but also the firmware (UEFI,
> U-Boot).
>
> Furthermore, I don't think this can be easily adopted in public clouds,
> where the admin for Xen and the guest will be different. So any
> indication of the ABI would have to come from the guest itself rather
> than from the configuration.

Makes sense. I am experimenting with an alternative design which requires
setting bit 30 of the EAX register (bit 31 is already used for Viridian) to
indicate the use of this new ABI for a given hypercall. I am also keeping a
CPUID bit to indicate that the feature is supported by the hypervisor, so
there is no need to enable it in advance for a guest.

> Cheers,
>

Teddy


Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
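A rough sketch of how such a per-call check could sit next to the existing
Viridian bit-31 test in hvm_hypercall(). is_viridian_domain() and
viridian_hypercall() are the existing helpers; the bit-30 macro and the
per-vCPU hcall_physaddr field are invented here for illustration:

/*
 * Illustrative only: bit 31 of EAX already routes to the Viridian path, so
 * bit 30 could mark "arguments are guest physical addresses" for this one
 * invocation.  HVM_HCALL_PHYSADDR and curr->hcall_physaddr are made up.
 */
#define HVM_HCALL_PHYSADDR  (1u << 30)

static int hvm_hypercall_sketch(struct cpu_user_regs *regs)
{
    struct vcpu *curr = current;
    uint32_t eax = regs->eax;

    /* Bit 31 already selects the Viridian calling convention. */
    if ( (eax & 0x80000000U) && is_viridian_domain(curr->domain) )
        return viridian_hypercall(regs);

    /* Bit 30 (hypothetical) selects the GPA-based ABI for this call only. */
    curr->hcall_physaddr = !!(eax & HVM_HCALL_PHYSADDR);

    /* ... dispatch as usual on (eax & ~HVM_HCALL_PHYSADDR) ... */
    return 0;
}

Combined with a CPUID feature bit, a guest could probe for hypervisor
support and then opt in one hypercall at a time, without any toolstack
configuration.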