[XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations

Grygorii Strashko posted 1 patch 2 months, 3 weeks ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20251114140117.270461-1-grygorii_strashko@epam.com
There is a newer version of this series
[XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Grygorii Strashko 2 months, 3 weeks ago
From: Grygorii Strashko <grygorii_strashko@epam.com>

Xen uses the following pattern for the raw_x_guest() functions:

#define raw_copy_to_guest(dst, src, len)       \
    (is_hvm_vcpu(current) ?                     \
     copy_to_user_hvm((dst), (src), (len)) :    \
     copy_to_guest_pv(dst, src, len))

This pattern behaves as follows, depending on CONFIG_PV/CONFIG_HVM:
- PV=y and HVM=y
  The proper guest access function is selected depending on the domain type.
- PV=y and HVM=n
  Only PV domains are possible. is_hvm_domain/vcpu() constifies to "false",
  so the compiler optimizes out the HVM-specific part (see the sketch below).
- PV=n and HVM=y
  Only HVM domains are possible. is_hvm_domain/vcpu() is not constified,
  so none of the PV-specific code is optimized out by the compiler.
- PV=n and HVM=n
  No guests should be possible. The code will still follow the PV path.
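
The "constify" remark is about how the predicate is gated on CONFIG_HVM;
roughly (a sketch from memory, the exact Xen definition may differ in
detail):

    static always_inline bool is_hvm_domain(const struct domain *d)
    {
        return IS_ENABLED(CONFIG_HVM) &&
               evaluate_nospec(d->options & XEN_DOMCTL_CDF_hvm);
    }

With HVM=n the IS_ENABLED() term is a compile-time "false" and the whole
predicate folds to a constant; with PV=n there is no equivalent constant
term, so nothing folds on the PV side of the ternary.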

Rework the raw_x_guest() code to use static inline functions which account
for the above PV/HVM configurations, with the main intention of optimizing
the code for the (PV=n and HVM=y) case.

For the (PV=n and HVM=n) case, return the "len" value, indicating failure
(no guests should be possible in this case, which means no access to guest
memory should ever happen).
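
(Callers conventionally treat a non-zero return as the number of bytes
NOT copied, i.e. as failure; a typical, illustrative use site -- the
variable names here are invented:

    if ( copy_to_guest(arg, &op, 1) )   /* non-zero: copy failed */
        return -EFAULT;

so returning "len" makes every guest access fail cleanly.)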

Finally, move arch/x86/usercopy.c to arch/x86/pv/usercopy.c so that it is
built only with PV=y.

The measured (bloat-o-meter) improvement for the (PV=n and HVM=y) case is:
  add/remove: 3/8 grow/shrink: 3/89 up/down: 1018/-12087 (-11069)
  Total: Before=1937280, After=1926211, chg -0.57%

[teddy.astie@vates.tech: suggested to use static inline functions instead
of macro combinations]
Suggested-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
---
changes in v4:
- move usercopy.c into arch/x86/pv/
- rework to always dynamically check for HVM vcpu(domain) by using is_hvm_vcpu()
  as requested by Jan Beulich

changes in v3:
- add raw_use_hvm_access() wrapper

changes in v2:
- use static inline functions instead of macro combinations

 xen/arch/x86/Makefile                   |  4 --
 xen/arch/x86/include/asm/guest_access.h | 78 ++++++++++++++++++-------
 xen/arch/x86/pv/Makefile                |  4 ++
 xen/arch/x86/{ => pv}/usercopy.c        |  0
 4 files changed, 62 insertions(+), 24 deletions(-)
 rename xen/arch/x86/{ => pv}/usercopy.c (100%)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 407571c510e1..6e2b17471719 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -71,7 +71,6 @@ obj-y += time.o
 obj-y += traps-setup.o
 obj-y += traps.o
 obj-$(CONFIG_INTEL) += tsx.o
-obj-y += usercopy.o
 obj-y += x86_emulate.o
 obj-$(CONFIG_TBOOT) += tboot.o
 obj-y += hpet.o
@@ -92,9 +91,6 @@ hostprogs-y += efi/mkreloc
 
 $(obj)/efi/mkreloc: HOSTCFLAGS += -I$(srctree)/include
 
-# Allows usercopy.c to include itself
-$(obj)/usercopy.o: CFLAGS-y += -iquote .
-
 ifneq ($(CONFIG_HVM),y)
 $(obj)/x86_emulate.o: CFLAGS-y += -Wno-unused-label
 endif
diff --git a/xen/arch/x86/include/asm/guest_access.h b/xen/arch/x86/include/asm/guest_access.h
index 69716c8b41bb..f0e56b112e14 100644
--- a/xen/arch/x86/include/asm/guest_access.h
+++ b/xen/arch/x86/include/asm/guest_access.h
@@ -13,26 +13,64 @@
 #include <asm/hvm/guest_access.h>
 
 /* Raw access functions: no type checking. */
-#define raw_copy_to_guest(dst, src, len)        \
-    (is_hvm_vcpu(current) ?                     \
-     copy_to_user_hvm((dst), (src), (len)) :    \
-     copy_to_guest_pv(dst, src, len))
-#define raw_copy_from_guest(dst, src, len)      \
-    (is_hvm_vcpu(current) ?                     \
-     copy_from_user_hvm((dst), (src), (len)) :  \
-     copy_from_guest_pv(dst, src, len))
-#define raw_clear_guest(dst,  len)              \
-    (is_hvm_vcpu(current) ?                     \
-     clear_user_hvm((dst), (len)) :             \
-     clear_guest_pv(dst, len))
-#define __raw_copy_to_guest(dst, src, len)      \
-    (is_hvm_vcpu(current) ?                     \
-     copy_to_user_hvm((dst), (src), (len)) :    \
-     __copy_to_guest_pv(dst, src, len))
-#define __raw_copy_from_guest(dst, src, len)    \
-    (is_hvm_vcpu(current) ?                     \
-     copy_from_user_hvm((dst), (src), (len)) :  \
-     __copy_from_guest_pv(dst, src, len))
+static inline unsigned int raw_copy_to_guest(void *dst, const void *src,
+                                             unsigned int len)
+{
+    if ( is_hvm_vcpu(current) )
+        return copy_to_user_hvm(dst, src, len);
+
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return len;
+
+    return copy_to_guest_pv(dst, src, len);
+}
+
+static inline unsigned int raw_copy_from_guest(void *dst, const void *src,
+                                               unsigned int len)
+{
+    if ( is_hvm_vcpu(current) )
+        return copy_from_user_hvm(dst, src, len);
+
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return len;
+
+    return copy_from_guest_pv(dst, src, len);
+}
+
+static inline unsigned int raw_clear_guest(void *dst, unsigned int len)
+{
+    if ( is_hvm_vcpu(current) )
+        return clear_user_hvm(dst, len);
+
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return len;
+
+    return clear_guest_pv(dst, len);
+}
+
+static inline unsigned int __raw_copy_to_guest(void *dst, const void *src,
+                                               unsigned int len)
+{
+    if ( is_hvm_vcpu(current) )
+        return copy_to_user_hvm(dst, src, len);
+
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return len;
+
+    return __copy_to_guest_pv(dst, src, len);
+}
+
+static inline unsigned int __raw_copy_from_guest(void *dst, const void *src,
+                                                 unsigned int len)
+{
+    if ( is_hvm_vcpu(current) )
+        return copy_from_user_hvm(dst, src, len);
+
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return len;
+
+    return __copy_from_guest_pv(dst, src, len);
+}
 
 /*
  * Pre-validate a guest handle.
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 6cda354cc41f..59489cd75af6 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
 obj-$(CONFIG_PV_SHIM) += shim.o
 obj-$(CONFIG_TRACEBUFFER) += trace.o
 obj-y += traps.o
+obj-$(CONFIG_PV) += usercopy.o
 
 obj-bin-y += dom0_build.init.o
 obj-bin-y += gpr_switch.o
+
+# Allows usercopy.c to include itself
+$(obj)/usercopy.o: CFLAGS-y += -iquote .
diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/pv/usercopy.c
similarity index 100%
rename from xen/arch/x86/usercopy.c
rename to xen/arch/x86/pv/usercopy.c
-- 
2.34.1
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Jan Beulich 2 months, 3 weeks ago
On 14.11.2025 15:01, Grygorii Strashko wrote:
> From: Grygorii Strashko <grygorii_strashko@epam.com>
> 
> Xen uses the following pattern for the raw_x_guest() functions:
> 
> #define raw_copy_to_guest(dst, src, len)       \
>     (is_hvm_vcpu(current) ?                     \
>      copy_to_user_hvm((dst), (src), (len)) :    \
>      copy_to_guest_pv(dst, src, len))
> 
> This pattern behaves as follows, depending on CONFIG_PV/CONFIG_HVM:
> - PV=y and HVM=y
>   The proper guest access function is selected depending on the domain type.
> - PV=y and HVM=n
>   Only PV domains are possible. is_hvm_domain/vcpu() constifies to "false",
>   so the compiler optimizes out the HVM-specific part.
> - PV=n and HVM=y
>   Only HVM domains are possible. is_hvm_domain/vcpu() is not constified,
>   so none of the PV-specific code is optimized out by the compiler.
> - PV=n and HVM=n
>   No guests should be possible. The code will still follow the PV path.
> 
> Rework the raw_x_guest() code to use static inline functions which account
> for the above PV/HVM configurations, with the main intention of optimizing
> the code for the (PV=n and HVM=y) case.
> 
> For the (PV=n and HVM=n) case, return the "len" value, indicating failure
> (no guests should be possible in this case, which means no access to guest
> memory should ever happen).
> 
> Finally, move arch/x86/usercopy.c to arch/x86/pv/usercopy.c so that it is
> built only with PV=y.
> 
> The measured (bloat-o-meter) improvement for the (PV=n and HVM=y) case is:
>   add/remove: 3/8 grow/shrink: 3/89 up/down: 1018/-12087 (-11069)
>   Total: Before=1937280, After=1926211, chg -0.57%
> 
> [teddy.astie@vates.tech: suggested to use static inline functions instead
> of macro combinations]
> Suggested-by: Teddy Astie <teddy.astie@vates.tech>
> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>

I would guess that this R-b would have needed dropping, ...

> ---
> changes in v4:
> - move usercopy.c into arch/x86/pv/
> - rework to always dynamically check for HVM vcpu(domain) by using is_hvm_vcpu()
>   as requested by Jan Beulich

... with at least the latter of these two changes.

> --- a/xen/arch/x86/pv/Makefile
> +++ b/xen/arch/x86/pv/Makefile
> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>  obj-$(CONFIG_PV_SHIM) += shim.o
>  obj-$(CONFIG_TRACEBUFFER) += trace.o
>  obj-y += traps.o
> +obj-$(CONFIG_PV) += usercopy.o

Just obj-y with the movement.

However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
which aren't PV-specific. This may be only a latent issue right now, as we
have only a single use site of copy_from_unsafe(), but those functions need
to remain available. (We may want to arrange for them to be removed when
linking, as long as they're not referenced. But that's a separate topic.)

Jan
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Grygorii Strashko 2 months, 3 weeks ago

On 17.11.25 18:43, Jan Beulich wrote:
> On 14.11.2025 15:01, Grygorii Strashko wrote:
>> From: Grygorii Strashko <grygorii_strashko@epam.com>
>>
>> Xen uses the following pattern for the raw_x_guest() functions:
>>
>> #define raw_copy_to_guest(dst, src, len)       \
>>      (is_hvm_vcpu(current) ?                     \
>>       copy_to_user_hvm((dst), (src), (len)) :    \
>>       copy_to_guest_pv(dst, src, len))
>>
>> This pattern behaves as follows, depending on CONFIG_PV/CONFIG_HVM:
>> - PV=y and HVM=y
>>    The proper guest access function is selected depending on the domain type.
>> - PV=y and HVM=n
>>    Only PV domains are possible. is_hvm_domain/vcpu() constifies to "false",
>>    so the compiler optimizes out the HVM-specific part.
>> - PV=n and HVM=y
>>    Only HVM domains are possible. is_hvm_domain/vcpu() is not constified,
>>    so none of the PV-specific code is optimized out by the compiler.
>> - PV=n and HVM=n
>>    No guests should be possible. The code will still follow the PV path.
>>
>> Rework the raw_x_guest() code to use static inline functions which account
>> for the above PV/HVM configurations, with the main intention of optimizing
>> the code for the (PV=n and HVM=y) case.
>>
>> For the (PV=n and HVM=n) case, return the "len" value, indicating failure
>> (no guests should be possible in this case, which means no access to guest
>> memory should ever happen).
>>
>> Finally, move arch/x86/usercopy.c to arch/x86/pv/usercopy.c so that it is
>> built only with PV=y.
>>
>> The measured (bloat-o-meter) improvement for the (PV=n and HVM=y) case is:
>>    add/remove: 3/8 grow/shrink: 3/89 up/down: 1018/-12087 (-11069)
>>    Total: Before=1937280, After=1926211, chg -0.57%
>>
>> [teddy.astie@vates.tech: suggested to use static inline functions instead
>> of macro combinations]
>> Suggested-by: Teddy Astie <teddy.astie@vates.tech>
>> Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
>> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
> 
> I would guess that this R-b would have needed dropping, ...
> 
>> ---
>> changes in v4:
>> - move usercopy.c into arch/x86/pv/
>> - rework to always dynamically check for HVM vcpu(domain) by using is_hvm_vcpu()
>>    as requested by Jan Beulich
> 
> ... with at least the latter of these two changes.
> 
>> --- a/xen/arch/x86/pv/Makefile
>> +++ b/xen/arch/x86/pv/Makefile
>> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>>   obj-$(CONFIG_PV_SHIM) += shim.o
>>   obj-$(CONFIG_TRACEBUFFER) += trace.o
>>   obj-y += traps.o
>> +obj-$(CONFIG_PV) += usercopy.o
> 
> Just obj-y with the movement.
> 
> However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
> version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
> which aren't PV-specific. This may be only a latent issue right now, as we
> have only a single use site of copy_from_unsafe(), but those functions need
> to remain available. (We may want to arrange for them to be removed when
> linking, as long as they're not referenced. But that's a separate topic.)

It is confusing that none of the build cfg combinations failed
(HVM=y PV=n, HVM=n PV=n) :(

copy_to_unsafe_ll()
- called from copy_to_unsafe()
- copy_to_unsafe() has no users (unreachable, MISRA 2.1?)

copy_from_unsafe_ll()
- called from copy_from_unsafe()
- copy_from_unsafe() is called from one place, do_invalid_op(), with
   copy_from_unsafe(,, n = sizeof(bug_insn)).
   Due to the __builtin_constant_p(n) check, the copy_from_unsafe() call
   is optimized by the compiler to
   get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);

As a result, copy_from_unsafe_ll() is also unreachable (?).
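
(For reference, copy_from_unsafe() has roughly this shape -- paraphrased,
not the exact source:

    static always_inline unsigned int
    copy_from_unsafe(void *to, const void __user *from, unsigned int n)
    {
        if ( __builtin_constant_p(n) )
        {
            unsigned long ret;

            switch ( n )
            {
            case 2:
                /* sizeof(bug_insn) == 2 hits this constant-size path */
                get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);
                return ret;
            /* ... other constant sizes ... */
            }
        }
        return copy_from_unsafe_ll(to, from, n);
    }

so the _ll() fallback is only reachable for non-constant n.)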

If those functions are not subject to removal, usercopy.c can't be
moved into "x86/pv", right?

Making copy_{from,to}_unsafe_ll() available for !PV means
rewriting usercopy.c in some way, right?

-- 
Best regards,
-grygorii
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Jan Beulich 2 months, 3 weeks ago
On 18.11.2025 14:08, Grygorii Strashko wrote:
> On 17.11.25 18:43, Jan Beulich wrote:
>> On 14.11.2025 15:01, Grygorii Strashko wrote:
>>> --- a/xen/arch/x86/pv/Makefile
>>> +++ b/xen/arch/x86/pv/Makefile
>>> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>>>   obj-$(CONFIG_PV_SHIM) += shim.o
>>>   obj-$(CONFIG_TRACEBUFFER) += trace.o
>>>   obj-y += traps.o
>>> +obj-$(CONFIG_PV) += usercopy.o
>>
>> Just obj-y with the movement.
>>
>> However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
>> version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
>> which aren't PV-specific. This may be only a latent issue right now, as we
>> have only a single use site of copy_from_unsafe(), but those functions need
>> to remain available. (We may want to arrange for them to be removed when
>> linking, as long as they're not referenced. But that's a separate topic.)
> 
> It is confusing that none of the build cfg combinations failed
> (HVM=y PV=n, HVM=n PV=n) :(
> 
> copy_to_unsafe_ll()
> - called from copy_to_unsafe()
> - copy_to_unsafe() has no users (unreachable, MISRA 2.1?)
> 
> copy_from_unsafe_ll()
> - called from copy_from_unsafe()
> - copy_from_unsafe() is called from one place, do_invalid_op(), with
>    copy_from_unsafe(,, n = sizeof(bug_insn)).
>    Due to the __builtin_constant_p(n) check, the copy_from_unsafe() call
>    is optimized by the compiler to
>    get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);
> 
> As a result, copy_from_unsafe_ll() is also unreachable (?).

Yes, these likely all want to become library-like, so they are linked in only
when actually referenced.

> If those functions are not subject to removal, usercopy.c can't be
> moved into "x86/pv", right?

That's my take, yes.

> Making copy_{from,to}_unsafe_ll() available for !PV means
> rewriting usercopy.c in some way, right?

"Re-writing" is probably too much, but some adjustments would be needed if
you want to keep the "unsafe" functions but compile out the "guest" ones.
It may be possible to compile the file twice, once from x86/pv/ and once
from x86/, replacing the self-#include near the bottom of the file. The
former would then produce the "guest" functions, the latter the "unsafe"
ones.

Jan
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Grygorii Strashko 2 months, 3 weeks ago
Hi Jan

On 18.11.25 15:45, Jan Beulich wrote:
> On 18.11.2025 14:08, Grygorii Strashko wrote:
>> On 17.11.25 18:43, Jan Beulich wrote:
>>> On 14.11.2025 15:01, Grygorii Strashko wrote:
>>>> --- a/xen/arch/x86/pv/Makefile
>>>> +++ b/xen/arch/x86/pv/Makefile
>>>> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>>>>    obj-$(CONFIG_PV_SHIM) += shim.o
>>>>    obj-$(CONFIG_TRACEBUFFER) += trace.o
>>>>    obj-y += traps.o
>>>> +obj-$(CONFIG_PV) += usercopy.o
>>>
>>> Just obj-y with the movement.
>>>
>>> However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
>>> version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
>>> which aren't PV-specific. This may be only a latent issue right now, as we
>>> have only a single use site of copy_from_unsafe(), but those functions need
>>> to remain available. (We may want to arrange for them to be removed when
>>> linking, as long as they're not referenced. But that's a separate topic.)
>>
>> It is confusing that none of the build cfg combinations failed
>> (HVM=y PV=n, HVM=n PV=n) :(
>>
>> copy_to_unsafe_ll()
>> - called from copy_to_unsafe()
>> - copy_to_unsafe() has no users (unreachable, MISRA 2.1?)
>>
>> copy_from_unsafe_ll()
>> - called from copy_from_unsafe()
>> - copy_from_unsafe() is called from one place, do_invalid_op(), with
>>     copy_from_unsafe(,, n = sizeof(bug_insn)).
>>     Due to the __builtin_constant_p(n) check, the copy_from_unsafe() call
>>     is optimized by the compiler to
>>     get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);
>>
>> As a result, copy_from_unsafe_ll() is also unreachable (?).
> 
> Yes, these likely all want to become library-like, so they are linked in only
> when actually referenced.
> 
>> If those functions are not subject to removal, usercopy.c can't be
>> moved into "x86/pv", right?
> 
> That's my take, yes.
> 
>> Making copy_{from,to}_unsafe_ll() available for !PV means
>> rewriting usercopy.c in some way, right?
> 
> "Re-writing" is probably too much, but some adjustments would be needed if
> you want to keep the "unsafe" functions but compile out the "guest" ones.
> It may be possible to compile the file twice, once from x86/pv/ and once
> from x86/, replacing the self-#include near the bottom of the file. The
> former would then produce the "guest" functions, the latter the "unsafe"
> ones.

Below is the diff I came up with, will it work?

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 6e2b17471719..a2017b4600b3 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -71,6 +71,7 @@ obj-y += time.o
  obj-y += traps-setup.o
  obj-y += traps.o
  obj-$(CONFIG_INTEL) += tsx.o
+obj-y += usercopy.o
  obj-y += x86_emulate.o
  obj-$(CONFIG_TBOOT) += tboot.o
  obj-y += hpet.o
diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 59489cd75af6..1fddfac8303e 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -14,10 +14,7 @@ obj-y += ro-page-fault.o
  obj-$(CONFIG_PV_SHIM) += shim.o
  obj-$(CONFIG_TRACEBUFFER) += trace.o
  obj-y += traps.o
-obj-$(CONFIG_PV) += usercopy.o
+obj-y += usercopy.o
  
  obj-bin-y += dom0_build.init.o
  obj-bin-y += gpr_switch.o
-
-# Allows usercopy.c to include itself
-$(obj)/usercopy.o: CFLAGS-y += -iquote .
diff --git a/xen/arch/x86/pv/usercopy.c b/xen/arch/x86/pv/usercopy.c
index a24b52cc66c1..6ca6eca5d818 100644
--- a/xen/arch/x86/pv/usercopy.c
+++ b/xen/arch/x86/pv/usercopy.c
@@ -64,8 +64,6 @@ unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int
      return n;
  }
  
-#if GUARD(1) + 0
-
  /**
   * copy_to_guest_pv: - Copy a block of data into PV guest space.
   * @to:   Destination address, in PV guest space.
@@ -139,16 +137,6 @@ unsigned int copy_from_guest_pv(void *to, const void __user *from,
      return n;
  }
  
-# undef GUARD
-# define GUARD UA_DROP
-# define copy_to_guest_ll copy_to_unsafe_ll
-# define copy_from_guest_ll copy_from_unsafe_ll
-# undef __user
-# define __user
-# include __FILE__
-
-#endif /* GUARD(1) */
-
  /*
   * Local variables:
   * mode: C
diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/usercopy.c
new file mode 100644
index 000000000000..9c12eda64181
--- /dev/null
+++ b/xen/arch/x86/usercopy.c
@@ -0,0 +1,77 @@
+/*
+ * User address space access functions.
+ *
+ * Copyright 1997 Andi Kleen <ak@muc.de>
+ * Copyright 1997 Linus Torvalds
+ * Copyright 2002 Andi Kleen <ak@suse.de>
+ */
+
+#include <xen/lib.h>
+#include <xen/sched.h>
+#include <asm/uaccess.h>
+
+# define GUARD UA_DROP
+# define copy_to_guest_ll copy_to_unsafe_ll
+# define copy_from_guest_ll copy_from_unsafe_ll
+# undef __user
+# define __user
+
+unsigned int copy_to_guest_ll(void __user *to, const void *from, unsigned int n)
+{
+    GUARD(unsigned dummy);
+
+    stac();
+    asm_inline volatile (
+        GUARD(
+        "    guest_access_mask_ptr %[to], %q[scratch1], %q[scratch2]\n"
+        )
+        "1:  rep movsb\n"
+        "2:\n"
+        _ASM_EXTABLE(1b, 2b)
+        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from)
+          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
+        :: "memory" );
+    clac();
+
+    return n;
+}
+
+unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int n)
+{
+    unsigned dummy;
+
+    stac();
+    asm_inline volatile (
+        GUARD(
+        "    guest_access_mask_ptr %[from], %q[scratch1], %q[scratch2]\n"
+        )
+        "1:  rep movsb\n"
+        "2:\n"
+        ".section .fixup,\"ax\"\n"
+        "6:  mov  %[cnt], %k[from]\n"
+        "    xchg %%eax, %[aux]\n"
+        "    xor  %%eax, %%eax\n"
+        "    rep stosb\n"
+        "    xchg %[aux], %%eax\n"
+        "    mov  %k[from], %[cnt]\n"
+        "    jmp 2b\n"
+        ".previous\n"
+        _ASM_EXTABLE(1b, 6b)
+        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from),
+          [aux] "=&r" (dummy)
+          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
+        :: "memory" );
+    clac();
+
+    return n;
+}
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Jan Beulich 2 months, 3 weeks ago
On 19.11.2025 20:36, Grygorii Strashko wrote:
> Hi Jan
> 
> On 18.11.25 15:45, Jan Beulich wrote:
>> On 18.11.2025 14:08, Grygorii Strashko wrote:
>>> On 17.11.25 18:43, Jan Beulich wrote:
>>>> On 14.11.2025 15:01, Grygorii Strashko wrote:
>>>>> --- a/xen/arch/x86/pv/Makefile
>>>>> +++ b/xen/arch/x86/pv/Makefile
>>>>> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>>>>>    obj-$(CONFIG_PV_SHIM) += shim.o
>>>>>    obj-$(CONFIG_TRACEBUFFER) += trace.o
>>>>>    obj-y += traps.o
>>>>> +obj-$(CONFIG_PV) += usercopy.o
>>>>
>>>> Just obj-y with the movement.
>>>>
>>>> However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
>>>> version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
>>>> which aren't PV-specific. This may be only a latent issue right now, as we
>>>> have only a single use site of copy_from_unsafe(), but those functions need
>>>> to remain available. (We may want to arrange for them to be removed when
>>>> linking, as long as they're not referenced. But that's a separate topic.)
>>>
>>> It is confusing that none of the build cfg combinations failed
>>> (HVM=y PV=n, HVM=n PV=n) :(
>>>
>>> copy_to_unsafe_ll()
>>> - called from copy_to_unsafe()
>>> - copy_to_unsafe() has no users (unreachable, MISRA 2.1?)
>>>
>>> copy_from_unsafe_ll()
>>> - called from copy_from_unsafe()
>>> - copy_from_unsafe() is called from one place, do_invalid_op(), with
>>>     copy_from_unsafe(,, n = sizeof(bug_insn)).
>>>     Due to the __builtin_constant_p(n) check, the copy_from_unsafe() call
>>>     is optimized by the compiler to
>>>     get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);
>>>
>>> As a result, copy_from_unsafe_ll() is also unreachable (?).
>>
>> Yes, these likely all want to become library-like, so they are linked in only
>> when actually referenced.
>>
>>> If those functions are not subject to removal, usercopy.c can't be
>>> moved into "x86/pv", right?
>>
>> That's my take, yes.
>>
>>> Making copy_{from,to}_unsafe_ll() available for !PV means
>>> rewriting usercopy.c in some way, right?
>>
>> "Re-writing" is probably too much, but some adjustments would be needed if
>> you want to keep the "unsafe" functions but compile out the "guest" ones.
>> It may be possible to compile the file twice, once from x86/pv/ and once
>> from x86/, replacing the self-#include near the bottom of the file. The
>> former would then produce the "guest" functions, the latter the "unsafe"
>> ones.
> 
> Below is the diff I came up with, will it work?

That'll be on you to make sure, but ...

> --- /dev/null
> +++ b/xen/arch/x86/usercopy.c
> @@ -0,0 +1,77 @@
> +/*
> + * User address space access functions.
> + *
> + * Copyright 1997 Andi Kleen <ak@muc.de>
> + * Copyright 1997 Linus Torvalds
> + * Copyright 2002 Andi Kleen <ak@suse.de>
> + */
> +
> +#include <xen/lib.h>
> +#include <xen/sched.h>
> +#include <asm/uaccess.h>
> +
> +# define GUARD UA_DROP
> +# define copy_to_guest_ll copy_to_unsafe_ll
> +# define copy_from_guest_ll copy_from_unsafe_ll
> +# undef __user
> +# define __user
> +
> +unsigned int copy_to_guest_ll(void __user *to, const void *from, unsigned int n)
> +{
> +    GUARD(unsigned dummy);
> +
> +    stac();
> +    asm_inline volatile (
> +        GUARD(
> +        "    guest_access_mask_ptr %[to], %q[scratch1], %q[scratch2]\n"
> +        )
> +        "1:  rep movsb\n"
> +        "2:\n"
> +        _ASM_EXTABLE(1b, 2b)
> +        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from)
> +          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
> +        :: "memory" );
> +    clac();
> +
> +    return n;
> +}
> +
> +unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int n)
> +{
> +    unsigned dummy;
> +
> +    stac();
> +    asm_inline volatile (
> +        GUARD(
> +        "    guest_access_mask_ptr %[from], %q[scratch1], %q[scratch2]\n"
> +        )
> +        "1:  rep movsb\n"
> +        "2:\n"
> +        ".section .fixup,\"ax\"\n"
> +        "6:  mov  %[cnt], %k[from]\n"
> +        "    xchg %%eax, %[aux]\n"
> +        "    xor  %%eax, %%eax\n"
> +        "    rep stosb\n"
> +        "    xchg %[aux], %%eax\n"
> +        "    mov  %k[from], %[cnt]\n"
> +        "    jmp 2b\n"
> +        ".previous\n"
> +        _ASM_EXTABLE(1b, 6b)
> +        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from),
> +          [aux] "=&r" (dummy)
> +          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
> +        :: "memory" );
> +    clac();
> +
> +    return n;
> +}

... we don't want to have two instances of that code in the code base. One file
should #include the other, after putting in place suitable #define-s. Which
direction the #include should work I'm not entirely certain:
- pv/usercopy.c including usercopy.c means usercopy.c then ends up misnamed,
- usercopy.c including pv/usercopy.c means a "common" file includes a more
  specialized (PV-only) one.
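
For the second variant, the wrapper could be as small as (a sketch only,
exact macro and file names to be settled):

    /* xen/arch/x86/usercopy.c -- builds just the "unsafe" flavour */
    #define GUARD UA_DROP
    #define copy_to_guest_ll   copy_to_unsafe_ll
    #define copy_from_guest_ll copy_from_unsafe_ll
    #undef  __user
    #define __user
    #include "pv/usercopy.c"

with pv/usercopy.c keeping the function bodies exactly once.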

Jan
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Jan Beulich 2 months, 2 weeks ago
On 20.11.2025 09:11, Jan Beulich wrote:
> On 19.11.2025 20:36, Grygorii Strashko wrote:
>> Hi Jan
>>
>> On 18.11.25 15:45, Jan Beulich wrote:
>>> On 18.11.2025 14:08, Grygorii Strashko wrote:
>>>> On 17.11.25 18:43, Jan Beulich wrote:
>>>>> On 14.11.2025 15:01, Grygorii Strashko wrote:
>>>>>> --- a/xen/arch/x86/pv/Makefile
>>>>>> +++ b/xen/arch/x86/pv/Makefile
>>>>>> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>>>>>>    obj-$(CONFIG_PV_SHIM) += shim.o
>>>>>>    obj-$(CONFIG_TRACEBUFFER) += trace.o
>>>>>>    obj-y += traps.o
>>>>>> +obj-$(CONFIG_PV) += usercopy.o
>>>>>
>>>>> Just obj-y with the movement.
>>>>>
>>>>> However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
>>>>> version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
>>>>> which aren't PV-specific. This may be only a latent issue right now, as we
>>>>> have only a single use site of copy_from_unsafe(), but those functions need
>>>>> to remain available. (We may want to arrange for them to be removed when
>>>>> linking, as long as they're not referenced. But that's a separate topic.)
>>>>
>>>> It is confusing that none of the build cfg combinations failed
>>>> (HVM=y PV=n, HVM=n PV=n) :(
>>>>
>>>> copy_to_unsafe_ll()
>>>> - called from copy_to_unsafe()
>>>> - copy_to_unsafe() has no users (unreachable, MISRA 2.1?)
>>>>
>>>> copy_from_unsafe_ll()
>>>> - called from copy_from_unsafe()
>>>> - copy_from_unsafe() is called from one place, do_invalid_op(), with
>>>>     copy_from_unsafe(,, n = sizeof(bug_insn)).
>>>>     Due to the __builtin_constant_p(n) check, the copy_from_unsafe() call
>>>>     is optimized by the compiler to
>>>>     get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);
>>>>
>>>> As a result, copy_from_unsafe_ll() is also unreachable (?).
>>>
>>> Yes, these likely all want to become library-like, so they are linked in only
>>> when actually referenced.
>>>
>>>> If those functions are not subject to removal, usercopy.c can't be
>>>> moved into "x86/pv", right?
>>>
>>> That's my take, yes.
>>>
>>>> Making copy_{from,to}_unsafe_ll() available for !PV means
>>>> rewriting usercopy.c in some way, right?
>>>
>>> "Re-writing" is probably too much, but some adjustments would be needed if
>>> you want to keep the "unsafe" functions but compile out the "guest" ones.
>>> It may be possible to compile the file twice, once from x86/pv/ and once
>>> from x86/, replacing the self-#include near the bottom of the file. The
>>> former would then produce the "guest" functions, the latter the "unsafe"
>>> ones.
>>
>> Below is the diff I came up with, will it work?
> 
> That'll be on you to make sure, but ...
> 
>> --- /dev/null
>> +++ b/xen/arch/x86/usercopy.c
>> @@ -0,0 +1,77 @@
>> +/*
>> + * User address space access functions.
>> + *
>> + * Copyright 1997 Andi Kleen <ak@muc.de>
>> + * Copyright 1997 Linus Torvalds
>> + * Copyright 2002 Andi Kleen <ak@suse.de>
>> + */
>> +
>> +#include <xen/lib.h>
>> +#include <xen/sched.h>
>> +#include <asm/uaccess.h>
>> +
>> +# define GUARD UA_DROP
>> +# define copy_to_guest_ll copy_to_unsafe_ll
>> +# define copy_from_guest_ll copy_from_unsafe_ll
>> +# undef __user
>> +# define __user
>> +
>> +unsigned int copy_to_guest_ll(void __user *to, const void *from, unsigned int n)
>> +{
>> +    GUARD(unsigned dummy);
>> +
>> +    stac();
>> +    asm_inline volatile (
>> +        GUARD(
>> +        "    guest_access_mask_ptr %[to], %q[scratch1], %q[scratch2]\n"
>> +        )
>> +        "1:  rep movsb\n"
>> +        "2:\n"
>> +        _ASM_EXTABLE(1b, 2b)
>> +        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from)
>> +          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
>> +        :: "memory" );
>> +    clac();
>> +
>> +    return n;
>> +}
>> +
>> +unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int n)
>> +{
>> +    unsigned dummy;
>> +
>> +    stac();
>> +    asm_inline volatile (
>> +        GUARD(
>> +        "    guest_access_mask_ptr %[from], %q[scratch1], %q[scratch2]\n"
>> +        )
>> +        "1:  rep movsb\n"
>> +        "2:\n"
>> +        ".section .fixup,\"ax\"\n"
>> +        "6:  mov  %[cnt], %k[from]\n"
>> +        "    xchg %%eax, %[aux]\n"
>> +        "    xor  %%eax, %%eax\n"
>> +        "    rep stosb\n"
>> +        "    xchg %[aux], %%eax\n"
>> +        "    mov  %k[from], %[cnt]\n"
>> +        "    jmp 2b\n"
>> +        ".previous\n"
>> +        _ASM_EXTABLE(1b, 6b)
>> +        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from),
>> +          [aux] "=&r" (dummy)
>> +          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
>> +        :: "memory" );
>> +    clac();
>> +
>> +    return n;
>> +}
> 
> ... we don't want to have two instances of that code in the code base. One file
> should #include the other, after putting in place suitable #define-s. Which
> direction the #include should work I'm not entirely certain:
> - pv/usercopy.c including usercopy.c means usercopy.c then ends up misnamed,
> - usercopy.c including pv/usercopy.c means a "common" file includes a more
>   specialized (PV-only) one.

Likely it would be best to build this into library members (ideally one per
function), such that unused ones wouldn't even be pulled in by the linker. I
meant to suggest to move to xen/arch/x86/lib/, but right now we only have
xen/lib/x86/, where I don't think this would be a good fit. I've brought this
up with the other x86 maintainers ...
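
(Hypothetically -- directory and object names invented here -- that could
look like:

    # xen/arch/x86/lib/Makefile
    lib-y += copy-to-unsafe.o
    lib-y += copy-from-unsafe.o

with one function per object, so the linker pulls in an archive member
only when it is actually referenced.)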

Jan
Re: [XEN][PATCH v4] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations
Posted by Grygorii Strashko 2 months, 1 week ago
Hi Jan,

On 24.11.25 12:08, Jan Beulich wrote:
> On 20.11.2025 09:11, Jan Beulich wrote:
>> On 19.11.2025 20:36, Grygorii Strashko wrote:
>>> Hi Jan
>>>
>>> On 18.11.25 15:45, Jan Beulich wrote:
>>>> On 18.11.2025 14:08, Grygorii Strashko wrote:
>>>>> On 17.11.25 18:43, Jan Beulich wrote:
>>>>>> On 14.11.2025 15:01, Grygorii Strashko wrote:
>>>>>>> --- a/xen/arch/x86/pv/Makefile
>>>>>>> +++ b/xen/arch/x86/pv/Makefile
>>>>>>> @@ -14,6 +14,10 @@ obj-y += ro-page-fault.o
>>>>>>>     obj-$(CONFIG_PV_SHIM) += shim.o
>>>>>>>     obj-$(CONFIG_TRACEBUFFER) += trace.o
>>>>>>>     obj-y += traps.o
>>>>>>> +obj-$(CONFIG_PV) += usercopy.o
>>>>>>
>>>>>> Just obj-y with the movement.
>>>>>>
>>>>>> However, is the movement (and was the adding of $(CONFIG_PV) in the earlier
>>>>>> version) actually correct? The file also produces copy_{from,to}_unsafe_ll(),
>>>>>> which aren't PV-specific. This may be only a latent issue right now, as we
>>>>>> have only a single use site of copy_from_unsafe(), but those functions need
>>>>>> to remain available. (We may want to arrange for them to be removed when
>>>>>> linking, as long as they're not referenced. But that's a separate topic.)
>>>>>
>>>>> It is confusing that none of the build cfg combinations failed
>>>>> (HVM=y PV=n, HVM=n PV=n) :(
>>>>>
>>>>> copy_to_unsafe_ll()
>>>>> - called from copy_to_unsafe()
>>>>> - copy_to_unsafe() has no users (unreachable, MISRA 2.1?)
>>>>>
>>>>> copy_from_unsafe_ll()
>>>>> - called from copy_from_unsafe()
>>>>> - copy_from_unsafe() is called from one place, do_invalid_op(), with
>>>>>      copy_from_unsafe(,, n = sizeof(bug_insn)).
>>>>>      Due to the __builtin_constant_p(n) check, the copy_from_unsafe() call
>>>>>      is optimized by the compiler to
>>>>>      get_unsafe_size(*(uint16_t *)to, from, 2, UA_DROP, ret, 2);
>>>>>
>>>>> As a result, copy_from_unsafe_ll() is also unreachable (?).
>>>>
>>>> Yes, these likely all want to become library-like, so they are linked in only
>>>> when actually referenced.
>>>>
>>>>> If those functions are not subject to removal, usercopy.c can't be
>>>>> moved into "x86/pv", right?
>>>>
>>>> That's my take, yes.
>>>>
>>>>> Making copy_{from,to}_unsafe_ll() available for !PV means
>>>>> rewriting usercopy.c in some way, right?
>>>>
>>>> "Re-writing" is probably too much, but some adjustments would be needed if
>>>> you want to keep the "unsafe" functions but compile out the "guest" ones.
>>>> It may be possible to compile the file twice, once from x86/pv/ and once
>>>> from x86/, replacing the self-#include near the bottom of the file. The
>>>> former would then produce the "guest" functions, the latter the "unsafe"
>>>> ones.
>>>
>>> Below is the diff I came up with, will it work?
>>
>> That'll be on you to make sure, but ...
>>
>>> --- /dev/null
>>> +++ b/xen/arch/x86/usercopy.c
>>> @@ -0,0 +1,77 @@
>>> +/*
>>> + * User address space access functions.
>>> + *
>>> + * Copyright 1997 Andi Kleen <ak@muc.de>
>>> + * Copyright 1997 Linus Torvalds
>>> + * Copyright 2002 Andi Kleen <ak@suse.de>
>>> + */
>>> +
>>> +#include <xen/lib.h>
>>> +#include <xen/sched.h>
>>> +#include <asm/uaccess.h>
>>> +
>>> +# define GUARD UA_DROP
>>> +# define copy_to_guest_ll copy_to_unsafe_ll
>>> +# define copy_from_guest_ll copy_from_unsafe_ll
>>> +# undef __user
>>> +# define __user
>>> +
>>> +unsigned int copy_to_guest_ll(void __user *to, const void *from, unsigned int n)
>>> +{
>>> +    GUARD(unsigned dummy);
>>> +
>>> +    stac();
>>> +    asm_inline volatile (
>>> +        GUARD(
>>> +        "    guest_access_mask_ptr %[to], %q[scratch1], %q[scratch2]\n"
>>> +        )
>>> +        "1:  rep movsb\n"
>>> +        "2:\n"
>>> +        _ASM_EXTABLE(1b, 2b)
>>> +        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from)
>>> +          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
>>> +        :: "memory" );
>>> +    clac();
>>> +
>>> +    return n;
>>> +}
>>> +
>>> +unsigned int copy_from_guest_ll(void *to, const void __user *from, unsigned int n)
>>> +{
>>> +    unsigned dummy;
>>> +
>>> +    stac();
>>> +    asm_inline volatile (
>>> +        GUARD(
>>> +        "    guest_access_mask_ptr %[from], %q[scratch1], %q[scratch2]\n"
>>> +        )
>>> +        "1:  rep movsb\n"
>>> +        "2:\n"
>>> +        ".section .fixup,\"ax\"\n"
>>> +        "6:  mov  %[cnt], %k[from]\n"
>>> +        "    xchg %%eax, %[aux]\n"
>>> +        "    xor  %%eax, %%eax\n"
>>> +        "    rep stosb\n"
>>> +        "    xchg %[aux], %%eax\n"
>>> +        "    mov  %k[from], %[cnt]\n"
>>> +        "    jmp 2b\n"
>>> +        ".previous\n"
>>> +        _ASM_EXTABLE(1b, 6b)
>>> +        : [cnt] "+c" (n), [to] "+D" (to), [from] "+S" (from),
>>> +          [aux] "=&r" (dummy)
>>> +          GUARD(, [scratch1] "=&r" (dummy), [scratch2] "=&r" (dummy))
>>> +        :: "memory" );
>>> +    clac();
>>> +
>>> +    return n;
>>> +}
>>
>> ... we don't want to have two instances of that code in the code base. One file
>> should #include the other, after putting in place suitable #define-s. Which
>> direction the #include should work I'm not entirely certain:
>> - pv/usercopy.c including usercopy.c means usercopy.c then ends up misnamed,
>> - usercopy.c including pv/usercopy.c means a "common" file includes a more
>>    specialized (PV-only) one.
> 
> Likely it would be best to build this into library members (ideally one per
> function), such that unused ones wouldn't even be pulled in by the linker. I
> meant to suggest to move to xen/arch/x86/lib/, but right now we only have
> xen/lib/x86/, where I don't think this would be a good fit. I've brought this
> up with the other x86 maintainers ...

I've seen your "x86 lib" patches. Hope they will move forward.
I'd be happy to resend on top of them.

-- 
Best regards,
-grygorii