In current HVM mode, when a hypercall references a structure in guest
memory, it is passed to the hypervisor as its "linear address" (e.g. the
virtual address in x86 long mode). One of the caveats is that this linear
address (GVA) is generally not directly usable by Xen and needs to be
translated from GVA to GPA and then to HPA. This implies a complex and
potentially expensive walk of the guest's page tables. The cost can be
significant, especially if the P2M cannot make efficient use of superpages
(or with e.g. XSA-304).

This proposal introduces a new mode where all addresses used for hypercalls
are guest physical addresses (GPAs) instead of guest virtual addresses
(GVAs). Looking up the HPA corresponding to such a GPA then only requires
a P2M lookup, with no walk of the guest's own page tables.

This mode is opt-in and must be enabled explicitly by the toolstack. It is
also mandatory for confidential-computing guests (e.g. SEV), where the
guest page tables are not visible to the hypervisor.

In a synthetic xtf-based hypercall benchmark (a VCPUOP_get_runstate_info
loop), it gives roughly a 30% overhead reduction when tested on an AMD
EPYC 9124.

This series only implements support for x86, but the ABI is meaningful for
other architectures as well. A separate patch adding support to Linux is
planned.

Teddy Astie (4):
  xen: Introduce physaddr_abi CDF flag
  x86/hvm: Consider physaddr_abi when copying from/to guest memory
  x86/public: Expose physaddr_abi through Xen HVM CPUID leaf
  libxl: Add support for enabling physaddr_abi

 tools/include/libxl.h               |  5 +++++
 tools/libs/light/libxl_create.c     |  4 ++++
 tools/libs/light/libxl_types.idl    |  1 +
 tools/xl/xl_parse.c                 |  1 +
 xen/arch/x86/cpuid.c                |  3 +++
 xen/arch/x86/hvm/hvm.c              | 17 ++++++++++++++---
 xen/common/domain.c                 | 10 +++++++++-
 xen/include/public/arch-x86/cpuid.h |  2 ++
 xen/include/public/domctl.h         |  4 +++-
 xen/include/xen/sched.h             |  6 ++++++
 10 files changed, 48 insertions(+), 5 deletions(-)

-- 
2.47.2


Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
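For illustration only, a minimal sketch of what the hypervisor-side copy
path could look like under such a mode. The hvm_copy_from_guest_phys() and
hvm_copy_from_guest_linear() helpers exist today in xen/arch/x86/hvm/hvm.c;
the d->physaddr_abi field and the wiring below are assumptions based on
this cover letter, not the actual patch:

/*
 * Minimal sketch, not the actual patch: assume a per-domain flag
 * d->physaddr_abi, set from the proposed CDF flag at domain creation.
 */
static enum hvm_translation_result copy_hypercall_arg(
    void *buf, uint64_t addr, unsigned int size)
{
    struct domain *d = current->domain;

    if ( d->physaddr_abi )
        /* addr is a GPA: resolving it is a single P2M lookup. */
        return hvm_copy_from_guest_phys(buf, addr, size);

    /*
     * addr is a linear address: Xen first has to walk the guest's own
     * page tables (GVA -> GPA) before the P2M lookup (GPA -> HPA).
     */
    return hvm_copy_from_guest_linear(buf, addr, size, 0, NULL);
}

The point is simply that the GPA path touches only the P2M, while the
linear-address path first has to walk the guest's own page tables.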
On 18.04.2025 16:18, Teddy Astie wrote:
> In current HVM mode, when a hypercall references a structure in guest
> memory, it is passed to the hypervisor as its "linear address" (e.g. the
> virtual address in x86 long mode). One of the caveats is that this linear
> address (GVA) is generally not directly usable by Xen and needs to be
> translated from GVA to GPA and then to HPA. This implies a complex and
> potentially expensive walk of the guest's page tables. The cost can be
> significant, especially if the P2M cannot make efficient use of
> superpages (or with e.g. XSA-304).
>
> This proposal introduces a new mode where all addresses used for
> hypercalls are guest physical addresses (GPAs) instead of guest virtual
> addresses (GVAs). Looking up the HPA corresponding to such a GPA then
> only requires a P2M lookup, with no walk of the guest's own page tables.
>
> This mode is opt-in and must be enabled explicitly by the toolstack.

Which I view as a severe downside (leaving aside the PVH Dom0 aspect): this
way a guest needs to be converted all in one go. While doable, it'll be
increasingly risky with the size of the guest kernel code base.

A prior proposal of mine was to add an indicator to hypercall numbers (e.g.
to set the top bit there), to indicate which of the two models a particular
hypercall invocation uses. Aiui Andrew had yet different (albeit also never
spelled out) plans.

Jan
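For the sake of illustration, a per-invocation indicator of this kind could
look as follows from the guest side: only the converted call site needs to
know about the new model. __HYPERVISOR_vcpu_op and VCPUOP_get_runstate_info
are the real public-interface names; the indicator bit position and the
hypercall3() wrapper are invented here:

/*
 * Hypothetical guest-side view of a per-invocation indicator: the caller
 * passes the guest-physical address of its argument buffer and tags the
 * hypercall number.  HYPERCALL_GPA_MODE and hypercall3() are made up.
 */
#define HYPERCALL_GPA_MODE  (1UL << 63)  /* assumed indicator bit */

static long get_runstate_info_by_gpa(unsigned int vcpu, uint64_t info_gpa)
{
    /*
     * info_gpa is the guest-physical address of a struct vcpu_runstate_info
     * buffer; the hypervisor can resolve it with a single P2M lookup, and
     * the rest of the kernel can keep using linear addresses until it is
     * converted too.
     */
    return hypercall3(__HYPERVISOR_vcpu_op | HYPERCALL_GPA_MODE,
                      VCPUOP_get_runstate_info, vcpu, info_gpa);
}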
Hi,

On 22/04/2025 08:45, Jan Beulich wrote:
> On 18.04.2025 16:18, Teddy Astie wrote:
>> In current HVM mode, when a hypercall references a structure in guest
>> memory, it is passed to the hypervisor as its "linear address" (e.g. the
>> virtual address in x86 long mode). One of the caveats is that this
>> linear address (GVA) is generally not directly usable by Xen and needs
>> to be translated from GVA to GPA and then to HPA. This implies a complex
>> and potentially expensive walk of the guest's page tables. The cost can
>> be significant, especially if the P2M cannot make efficient use of
>> superpages (or with e.g. XSA-304).
>>
>> This proposal introduces a new mode where all addresses used for
>> hypercalls are guest physical addresses (GPAs) instead of guest virtual
>> addresses (GVAs). Looking up the HPA corresponding to such a GPA then
>> only requires a P2M lookup, with no walk of the guest's own page tables.
>>
>> This mode is opt-in and must be enabled explicitly by the toolstack.
>
> Which I view as a severe downside (leaving aside the PVH Dom0 aspect):
> this way a guest needs to be converted all in one go. While doable, it'll
> be increasingly risky with the size of the guest kernel code base.

+1. It is not only the guest kernel, but also the firmware (UEFI, U-Boot).

Furthermore, I don't think this can be easily adopted in public clouds,
where the admin for Xen and the guest will be different. So any indication
of the ABI would have to come from the guest itself rather than from the
configuration.

Cheers,

-- 
Julien Grall
On 01/05/2025 13:14, Julien Grall wrote:
> Hi,
>
> On 22/04/2025 08:45, Jan Beulich wrote:
>> On 18.04.2025 16:18, Teddy Astie wrote:
>>> In current HVM mode, when a hypercall references a structure in guest
>>> memory, it is passed to the hypervisor as its "linear address" (e.g.
>>> the virtual address in x86 long mode). One of the caveats is that this
>>> linear address (GVA) is generally not directly usable by Xen and needs
>>> to be translated from GVA to GPA and then to HPA. This implies a
>>> complex and potentially expensive walk of the guest's page tables. The
>>> cost can be significant, especially if the P2M cannot make efficient
>>> use of superpages (or with e.g. XSA-304).
>>>
>>> This proposal introduces a new mode where all addresses used for
>>> hypercalls are guest physical addresses (GPAs) instead of guest virtual
>>> addresses (GVAs). Looking up the HPA corresponding to such a GPA then
>>> only requires a P2M lookup, with no walk of the guest's own page
>>> tables.
>>>
>>> This mode is opt-in and must be enabled explicitly by the toolstack.
>>
>> Which I view as a severe downside (leaving aside the PVH Dom0 aspect):
>> this way a guest needs to be converted all in one go. While doable,
>> it'll be increasingly risky with the size of the guest kernel code base.
>
> +1. It is not only the guest kernel, but also the firmware (UEFI,
> U-Boot).
>
> Furthermore, I don't think this can be easily adopted in public clouds,
> where the admin for Xen and the guest will be different. So any
> indication of the ABI would have to come from the guest itself rather
> than from the configuration.

Makes sense. I am experimenting with an alternative design which requires
setting bit 30 of the EAX register (bit 31 is already used for Viridian) to
indicate the use of this new ABI for a given hypercall. I am also keeping a
CPUID bit to indicate that the feature is supported by the hypervisor, so
there is no need to enable it in advance for a guest.

> Cheers,
>

Teddy


Teddy Astie | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
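A rough sketch of how such a per-call check could sit next to the existing
Viridian bit-31 test in hvm_hypercall(). is_viridian_domain() and
viridian_hypercall() are the existing helpers; the bit-30 macro and the
per-vCPU hcall_physaddr field are invented here for illustration:

/*
 * Illustrative only: bit 31 of EAX already routes to the Viridian path, so
 * bit 30 could mark "arguments are guest physical addresses" for this one
 * invocation.  HVM_HCALL_PHYSADDR and curr->hcall_physaddr are made up.
 */
#define HVM_HCALL_PHYSADDR  (1u << 30)

static int hvm_hypercall_sketch(struct cpu_user_regs *regs)
{
    struct vcpu *curr = current;
    uint32_t eax = regs->eax;

    /* Bit 31 already selects the Viridian calling convention. */
    if ( (eax & 0x80000000U) && is_viridian_domain(curr->domain) )
        return viridian_hypercall(regs);

    /* Bit 30 (hypothetical) selects the GPA-based ABI for this call only. */
    curr->hcall_physaddr = !!(eax & HVM_HCALL_PHYSADDR);

    /* ... dispatch as usual on (eax & ~HVM_HCALL_PHYSADDR) ... */
    return 0;
}

Combined with a CPUID feature bit, a guest could probe for hypervisor
support and then opt in one hypercall at a time, without any toolstack
configuration.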