[PATCH] x86/pv: Rework TRY_LOAD_SEG() to use asm goto()

Andrew Cooper posted 1 patch 3 months, 1 week ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20250718202548.2834921-1-andrew.cooper3@citrix.com
[PATCH] x86/pv: Rework TRY_LOAD_SEG() to use asm goto()
Posted by Andrew Cooper 3 months, 1 week ago
This moves the exception path to being out-of-line within the function, rather
than in the .fixup section, which improves backtraces.

Because the macro is used multiple times, the fault label needs declaring as
local.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>

Slightly RFC.  I haven't checked if Eclair will be happy with __label__ yet.

It is disappointing that, unless we retain the xor/mov for the exception path,
GCC decides to emit worse code, notably duplicating the mov %ds success path
in mov %es's error path.

The "+r" constraint was actually wrong before; the asm only produces
all_segs_okay and does not consume it.  Given leeway, GCC decides to manifest
$1 in a different register on each error path and OR them together (inverted,
I'm guessing) to reconstitute all_segs_okay.

Still, we've got rid of the manual jmp...
---
 xen/arch/x86/domain.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 56c381618712..d795e5b968e2 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1738,17 +1738,22 @@ static void load_segments(struct vcpu *n)
      * @all_segs_okay in function scope, and load NUL into @sel.
      */
 #define TRY_LOAD_SEG(seg, val)                          \
-    asm_inline volatile (                               \
-        "1: mov %k[_val], %%" #seg "\n\t"               \
-        "2:\n\t"                                        \
-        ".section .fixup, \"ax\"\n\t"                   \
-        "3: xor %k[ok], %k[ok]\n\t"                     \
-        "   mov %k[ok], %%" #seg "\n\t"                 \
-        "   jmp 2b\n\t"                                 \
-        ".previous\n\t"                                 \
-        _ASM_EXTABLE(1b, 3b)                            \
-        : [ok] "+r" (all_segs_okay)                     \
-        : [_val] "rm" (val) )
+    ({                                                  \
+        __label__ fault;                                \
+        asm_inline volatile goto (                      \
+            "1: mov %k[_val], %%" #seg "\n\t"           \
+            _ASM_EXTABLE(1b, %l[fault])                 \
+            :: [_val] "rm" (val)                        \
+            :: fault );                                 \
+        if ( 0 )                                        \
+        {                                               \
+        fault: __attribute__((cold));                   \
+            asm_inline volatile (                       \
+                "xor %k[ok], %k[ok]\n\t"                \
+                "mov %k[ok], %%" #seg                   \
+                : [ok] "=r" (all_segs_okay) );          \
+        }                                               \
+    })
 
     if ( !compat )
     {
-- 
2.39.5


Re: [PATCH] x86/pv: Rework TRY_LOAD_SEG() to use asm goto()
Posted by Jan Beulich 3 months, 1 week ago
On 18.07.2025 22:25, Andrew Cooper wrote:
> This moves the exception path to being out-of-line within the function, rather
> than in the .fixup section, which improves backtraces.
> 
> Because the macro is used multiple times, the fault label needs declaring as
> local.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> 
> Slightly RFC.  I haven't checked if Eclair will be happy with __label__ yet.

Even if it is, I guess you'd need to update the list of extensions we
use (docs/misra/C-language-toolchain.rst)?

> It is disappointing that, unless we retain the xor/mov for the exception path,
> GCC decides to emit worse code, notably duplicating the mov %ds success path
> in mov %es's error path.

Is it the pair of XOR/MOV, or merely the MOV (in which case it might be
nice to try omitting at least the XOR)? Yet then the dual purpose of the
zero is likely getting in the way anyway.

> The "+r" constraint was actually wrong before; the asm only produces
> all_segs_okay and does not consume it.

Yet it only conditionally set it in the old construct. That still needs
expressing with "+r", or else the variable's earlier setting could all
be eliminated. In the new construct using "=r" is okay.

> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1738,17 +1738,22 @@ static void load_segments(struct vcpu *n)
>       * @all_segs_okay in function scope, and load NUL into @sel.
>       */
>  #define TRY_LOAD_SEG(seg, val)                          \
> -    asm_inline volatile (                               \
> -        "1: mov %k[_val], %%" #seg "\n\t"               \
> -        "2:\n\t"                                        \
> -        ".section .fixup, \"ax\"\n\t"                   \
> -        "3: xor %k[ok], %k[ok]\n\t"                     \
> -        "   mov %k[ok], %%" #seg "\n\t"                 \
> -        "   jmp 2b\n\t"                                 \
> -        ".previous\n\t"                                 \
> -        _ASM_EXTABLE(1b, 3b)                            \
> -        : [ok] "+r" (all_segs_okay)                     \
> -        : [_val] "rm" (val) )
> +    ({                                                  \
> +        __label__ fault;                                \
> +        asm_inline volatile goto (                      \
> +            "1: mov %k[_val], %%" #seg "\n\t"           \
> +            _ASM_EXTABLE(1b, %l[fault])                 \
> +            :: [_val] "rm" (val)                        \

Thoughts on replacing "_val" by "sel" on this occasion?

> +            :: fault );                                 \
> +        if ( 0 )                                        \
> +        {                                               \
> +        fault: __attribute__((cold));                   \
> +            asm_inline volatile (                       \
> +                "xor %k[ok], %k[ok]\n\t"                \
> +                "mov %k[ok], %%" #seg                   \
> +                : [ok] "=r" (all_segs_okay) );          \

Purely formally I think you need "=&r" here now.

Jan

Re: [PATCH] x86/pv: Rework TRY_LOAD_SEG() to use asm goto()
Posted by Nicola Vetrini 3 months, 1 week ago
On 2025-07-21 08:41, Jan Beulich wrote:
> On 18.07.2025 22:25, Andrew Cooper wrote:
>> This moves the exception path to being out-of-line within the function, rather
>> than in the .fixup section, which improves backtraces.
>> 
>> Because the macro is used multiple times, the fault label needs declaring as
>> local.
>> 
>> No functional change.
>> 
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> 
>> Slightly RFC.  I haven't checked if Eclair will be happy with __label__ yet.
> 
> Even if it is, I guess you'd need to update the list of extensions we
> use (docs/misra/C-language-toolchain.rst)?
> 

Only for using the __label__ token in 
automation/eclair_analysis/ECLAIR/toolchain.ecl. The extension itself is 
already documented in 5590c7e6590d ("eclair: allow and document use of 
GCC extension for label addresses")

>> It is disappointing that, unless we retain the xor/mov for the exception path,
>> GCC decides to emit worse code, notably duplicating the mov %ds success path
>> in mov %es's error path.
> 
> Is it the pair of XOR/MOV, or merely the MOV (in which case it might be
> nice to try omitting at least the XOR)? Yet then the dual purpose of the
> zero is likely getting in the way anyway.
> 
>> The "+r" constraint was actually wrong before; the asm only produces
>> all_segs_okay and does not consume it.
> 
> Yet it only conditionally set it in the old construct. That still needs
> expressing with "+r", or else the variable's earlier setting could all
> be eliminated. In the new construct using "=r" is okay.
> 
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1738,17 +1738,22 @@ static void load_segments(struct vcpu *n)
>>       * @all_segs_okay in function scope, and load NUL into @sel.
>>       */
>>  #define TRY_LOAD_SEG(seg, val)                          \
>> -    asm_inline volatile (                               \
>> -        "1: mov %k[_val], %%" #seg "\n\t"               \
>> -        "2:\n\t"                                        \
>> -        ".section .fixup, \"ax\"\n\t"                   \
>> -        "3: xor %k[ok], %k[ok]\n\t"                     \
>> -        "   mov %k[ok], %%" #seg "\n\t"                 \
>> -        "   jmp 2b\n\t"                                 \
>> -        ".previous\n\t"                                 \
>> -        _ASM_EXTABLE(1b, 3b)                            \
>> -        : [ok] "+r" (all_segs_okay)                     \
>> -        : [_val] "rm" (val) )
>> +    ({                                                  \
>> +        __label__ fault;                                \
>> +        asm_inline volatile goto (                      \
>> +            "1: mov %k[_val], %%" #seg "\n\t"           \
>> +            _ASM_EXTABLE(1b, %l[fault])                 \
>> +            :: [_val] "rm" (val)                        \
> 
> Thoughts on replacing "_val" by "sel" on this occasion?
> 
>> +            :: fault );                                 \
>> +        if ( 0 )                                        \
>> +        {                                               \
>> +        fault: __attribute__((cold));                   \
>> +            asm_inline volatile (                       \
>> +                "xor %k[ok], %k[ok]\n\t"                \
>> +                "mov %k[ok], %%" #seg                   \
>> +                : [ok] "=r" (all_segs_okay) );          \
> 
> Purely formally I think you need "=&r" here now.
> 
> Jan

-- 
Nicola Vetrini, B.Sc.
Software Engineer
BUGSENG (https://bugseng.com)
LinkedIn: https://www.linkedin.com/in/nicola-vetrini-a42471253

Re: [PATCH] x86/pv: Rework TRY_LOAD_SEG() to use asm goto()
Posted by Jan Beulich 3 months, 1 week ago
On 21.07.2025 10:16, Nicola Vetrini wrote:
> On 2025-07-21 08:41, Jan Beulich wrote:
>> On 18.07.2025 22:25, Andrew Cooper wrote:
>>> This moves the exception path to being out-of-line within the function, rather
>>> than in the .fixup section, which improves backtraces.
>>>
>>> Because the macro is used multiple times, the fault label needs declaring as
>>> local.
>>>
>>> No functional change.
>>>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> CC: Jan Beulich <JBeulich@suse.com>
>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>>
>>> Slightly RFC.  I haven't checked if Eclair will be happy with __label__ yet.
>>
>> Even if it is, I guess you'd need to update the list of extensions we
>> use (docs/misra/C-language-toolchain.rst)?
> 
> Only for using the __label__ token in 
> automation/eclair_analysis/ECLAIR/toolchain.ecl. The extension itself is 
> already documented in 5590c7e6590d ("eclair: allow and document use of 
> GCC extension for label addresses")

Except that it's not the address taking that is the point in question here.
We have meanwhile gained a number of asm-goto (and for the uses there I'm
not even sure they count as "address taking"). It's really the __label__
extended keyword (and the thus possible declaration of a scope-restricted
label) that my remark was about. But yes, toolchain.ecl looks to need a
change, too.

Jan

Re: [PATCH] x86/pv: Rework TRY_LOAD_SEG() to use asm goto()
Posted by Nicola Vetrini 3 months, 1 week ago
On 2025-07-21 11:25, Jan Beulich wrote:
> On 21.07.2025 10:16, Nicola Vetrini wrote:
>> On 2025-07-21 08:41, Jan Beulich wrote:
>>> On 18.07.2025 22:25, Andrew Cooper wrote:
>>>> This moves the exception path to being out-of-line within the function, rather
>>>> than in the .fixup section, which improves backtraces.
>>>> 
>>>> Because the macro is used multiple times, the fault label needs declaring as
>>>> local.
>>>> 
>>>> No functional change.
>>>> 
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> ---
>>>> CC: Jan Beulich <JBeulich@suse.com>
>>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>>> 
>>>> Slightly RFC.  I haven't checked if Eclair will be happy with __label__ yet.
>>> 
>>> Even if it is, I guess you'd need to update the list of extensions we
>>> use (docs/misra/C-language-toolchain.rst)?
>> 
>> Only for using the __label__ token in
>> automation/eclair_analysis/ECLAIR/toolchain.ecl. The extension itself is
>> already documented in 5590c7e6590d ("eclair: allow and document use of
>> GCC extension for label addresses")
> 
> Except that it's not the address taking that is the point in question here.
> We have meanwhile gained a number of asm-goto (and for the uses there I'm
> not even sure they count as "address taking"). It's really the __label__
> extended keyword (and the thus possible declaration of a scope-restricted
> label) that my remark was about. But yes, toolchain.ecl looks to need a
> change, too.
> 

You're right, it also needs section 6.2 "Locally Declared Labels". Both are
easy to add if needed.
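
For readers unfamiliar with the extension, a minimal portable sketch of a locally declared label on its own, without asm goto; the names FIND_INDEX and index_sum are hypothetical and only serve the illustration. The label is scoped to the statement expression, so the macro can be expanded twice in one function.

```c
#include <stddef.h>

/* Hypothetical macro: index of @key in @arr, or -1 when it is absent. */
#define FIND_INDEX(arr, n, key)                         \
    ({                                                  \
        __label__ found;                                \
        long idx_ = -1;                                 \
        for ( size_t i_ = 0; i_ < (n); i_++ )           \
            if ( (arr)[i_] == (key) )                   \
            {                                           \
                idx_ = (long)i_;                        \
                goto found;                             \
            }                                           \
    found:                                              \
        idx_;                                           \
    })

/* Two expansions in one function: each gets its own "found" label. */
long index_sum(const int *arr, size_t n, int a, int b)
{
    long ia = FIND_INDEX(arr, n, a);
    long ib = FIND_INDEX(arr, n, b);

    return ia + ib;
}
```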

-- 
Nicola Vetrini, B.Sc.
Software Engineer
BUGSENG (https://bugseng.com)
LinkedIn: https://www.linkedin.com/in/nicola-vetrini-a42471253