x86/msr: Cleanup and FRED prep

[PATCH v2 02/16] x86/msr: Rework rdmsr_safe() using asm goto()

Posted by Andrew Cooper 2 months, 2 weeks ago

... on capable toolchains.

This avoids needing to hold rc in a register across the RDMSR, and in most
cases removes direct testing and branching based on rc, as the fault label can
be rearranged to directly land on the out-of-line block.

There is a subtle difference in behaviour.  The old behaviour would, on fault,
still produce 0's and write to val.

The new behaviour only writes val on success, and write_msr() is the only
place where this matters.  Move temp out of switch() scope and initialise it
to 0.

Resolves: https://gitlab.com/xen-project/xen/-/work_items/217
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>

Doing this safely does depend on getting GCC-14 into CI somewhere.  Debian
13/Trixie satisfies this, as does archlinux I expect.
---
 xen/arch/x86/include/asm/msr.h | 19 +++++++++++++++++++
 xen/arch/x86/pv/emul-priv-op.c |  3 +--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index d2c86ddb09e9..6a97b41bae07 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
 /* rdmsr with exception handling */
 static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
 {
+#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
+    uint64_t lo, hi;
+
+    asm_inline goto (
+        "1: rdmsr\n\t"
+        _ASM_EXTABLE(1b, %l[fault])
+        : "=a" (lo), "=d" (hi)
+        : "c" (msr)
+        :
+        : fault );
+
+    *val = lo | (hi << 32);
+
+    return 0;
+
+ fault:
+    return -EFAULT;
+#else
     int rc;
     uint64_t lo, hi;
 
@@ -73,6 +91,7 @@ static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
     *val = lo | (hi << 32);
 
     return rc;
+#endif
 }
 
 /* wrmsr with exception handling */
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 4afbee59e53e..c3a484c50bf8 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -1027,6 +1027,7 @@ static int cf_check write_msr(
     struct vcpu *curr = current;
     const struct domain *currd = curr->domain;
     const struct cpu_policy *cp = currd->arch.cpu_policy;
+    uint64_t temp = 0;
     bool vpmu_msr = false;
     int ret;
 
@@ -1040,8 +1041,6 @@ static int cf_check write_msr(
 
     switch ( reg )
     {
-        uint64_t temp;
-
     case MSR_FS_BASE:
     case MSR_GS_BASE:
     case MSR_SHADOW_GS_BASE:
-- 
2.39.5

Re: [PATCH v2 02/16] x86/msr: Rework rdmsr_safe() using asm goto()

Posted by Jan Beulich 2 months, 1 week ago

On 15.08.2025 22:41, Andrew Cooper wrote:
> ... on capable toolchains.
> 
> This avoids needing to hold rc in a register across the RDMSR, and in most
> cases removes direct testing and branching based on rc, as the fault label can
> be rearranged to directly land on the out-of-line block.
> 
> There is a subtle difference in behaviour.  The old behaviour would, on fault,
> still produce 0's and write to val.
> 
> The new behaviour only writes val on success, and write_msr() is the only
> place where this matters.  Move temp out of switch() scope and initialise it
> to 0.

But what's the motivation behind making this behavioral change? At least in
the cases where the return value isn't checked, it would feel safer if we
continued clearing the value. Even if in all cases where this could matter
(besides the one you cover here) one can prove correctness by looking at
surrounding code.

> --- a/xen/arch/x86/include/asm/msr.h
> +++ b/xen/arch/x86/include/asm/msr.h
> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>  /* rdmsr with exception handling */
>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>  {
> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
> +    uint64_t lo, hi;

Could at least this line move ahead of the #ifdef, to also cover ...

> +    asm_inline goto (
> +        "1: rdmsr\n\t"
> +        _ASM_EXTABLE(1b, %l[fault])
> +        : "=a" (lo), "=d" (hi)
> +        : "c" (msr)
> +        :
> +        : fault );
> +
> +    *val = lo | (hi << 32);
> +
> +    return 0;
> +
> + fault:
> +    return -EFAULT;
> +#else
>      int rc;
>      uint64_t lo, hi;

... the same being needed here?

Jan

Re: [PATCH v2 02/16] x86/msr: Rework rdmsr_safe() using asm goto()

Posted by Andrew Cooper 2 months, 1 week ago

On 18/08/2025 12:27 pm, Jan Beulich wrote:
> On 15.08.2025 22:41, Andrew Cooper wrote:
>> ... on capable toolchains.
>>
>> This avoids needing to hold rc in a register across the RDMSR, and in most
>> cases removes direct testing and branching based on rc, as the fault label can
>> be rearranged to directly land on the out-of-line block.
>>
>> There is a subtle difference in behaviour.  The old behaviour would, on fault,
>> still produce 0's and write to val.
>>
>> The new behaviour only writes val on success, and write_msr() is the only
>> place where this matters.  Move temp out of switch() scope and initialise it
>> to 0.
> But what's the motivation behind making this behavioral change? At least in
> the cases where the return value isn't checked, it would feel safer if we
> continued clearing the value. Even if in all cases where this could matter
> (besides the one you cover here) one can prove correctness by looking at
> surrounding code.

I didn't realise I'd made a change at first, but it's a consequence of
the compiler's ability to rearrange basic blocks.

It can be fixed with ...

>
>> --- a/xen/arch/x86/include/asm/msr.h
>> +++ b/xen/arch/x86/include/asm/msr.h
>> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>>  /* rdmsr with exception handling */
>>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>>  {
>> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
>> +    uint64_t lo, hi;
>> +    asm_inline goto (
>> +        "1: rdmsr\n\t"
>> +        _ASM_EXTABLE(1b, %l[fault])
>> +        : "=a" (lo), "=d" (hi)
>> +        : "c" (msr)
>> +        :
>> +        : fault );
>> +
>> +    *val = lo | (hi << 32);
>> +
>> +    return 0;
>> +
>> + fault:

    *val = 0;

here, but I don't want to do this.  Because val is by pointer and
generally spilled to the stack, the compiler can't optimise away the store.

I'd far rather get a real compiler error, than to have logic relying on
the result of a faulting MSR read.

>> +    return -EFAULT;
>> +#else
>>      int rc;
>>      uint64_t lo, hi;
> ... the same being needed here?

Fixed.

~Andrew

Re: [PATCH v2 02/16] x86/msr: Rework rdmsr_safe() using asm goto()

Posted by Jan Beulich 2 months, 1 week ago

On 19.08.2025 15:52, Andrew Cooper wrote:
> On 18/08/2025 12:27 pm, Jan Beulich wrote:
>> On 15.08.2025 22:41, Andrew Cooper wrote:
>>> ... on capable toolchains.
>>>
>>> This avoids needing to hold rc in a register across the RDMSR, and in most
>>> cases removes direct testing and branching based on rc, as the fault label can
>>> be rearranged to directly land on the out-of-line block.
>>>
>>> There is a subtle difference in behaviour.  The old behaviour would, on fault,
>>> still produce 0's and write to val.
>>>
>>> The new behaviour only writes val on success, and write_msr() is the only
>>> place where this matters.  Move temp out of switch() scope and initialise it
>>> to 0.
>> But what's the motivation behind making this behavioral change? At least in
>> the cases where the return value isn't checked, it would feel safer if we
>> continued clearing the value. Even if in all cases where this could matter
>> (besides the one you cover here) one can prove correctness by looking at
>> surrounding code.
> 
> I didn't realise I'd made a change at first, but it's a consequence of
> the compiler's ability to rearrange basic blocks.
> 
> It can be fixed with ...
> 
>>
>>> --- a/xen/arch/x86/include/asm/msr.h
>>> +++ b/xen/arch/x86/include/asm/msr.h
>>> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>>>  /* rdmsr with exception handling */
>>>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>>>  {
>>> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
>>> +    uint64_t lo, hi;
>>> +    asm_inline goto (
>>> +        "1: rdmsr\n\t"
>>> +        _ASM_EXTABLE(1b, %l[fault])
>>> +        : "=a" (lo), "=d" (hi)
>>> +        : "c" (msr)
>>> +        :
>>> +        : fault );
>>> +
>>> +    *val = lo | (hi << 32);
>>> +
>>> +    return 0;
>>> +
>>> + fault:
> 
>     *val = 0;
> 
> here, but I don't want to do this.  Because val is by pointer and
> generally spilled to the stack, the compiler can't optimise away the store.

But the compiler is dealing with such indirection in inline functions just
fine. I don't expect it would typically spill val to the stack. Is there
anything specific here that you think would make this more likely?

> I'd far rather get a real compiler error, than to have logic relying on
> the result of a faulting MSR read.

A compiler error? (Hmm, perhaps you think of uninitialized variable
diagnostics. That may or may not trigger, depending on how else the
caller's variable is used.)

Jan

Re: [PATCH v2 02/16] x86/msr: Rework rdmsr_safe() using asm goto()

Posted by Andrew Cooper 2 months, 1 week ago

On 19/08/2025 5:23 pm, Jan Beulich wrote:
> On 19.08.2025 15:52, Andrew Cooper wrote:
>> On 18/08/2025 12:27 pm, Jan Beulich wrote:
>>> On 15.08.2025 22:41, Andrew Cooper wrote:
>>>> ... on capable toolchains.
>>>>
>>>> This avoids needing to hold rc in a register across the RDMSR, and in most
>>>> cases removes direct testing and branching based on rc, as the fault label can
>>>> be rearranged to directly land on the out-of-line block.
>>>>
>>>> There is a subtle difference in behaviour.  The old behaviour would, on fault,
>>>> still produce 0's and write to val.
>>>>
>>>> The new behaviour only writes val on success, and write_msr() is the only
>>>> place where this matters.  Move temp out of switch() scope and initialise it
>>>> to 0.
>>> But what's the motivation behind making this behavioral change? At least in
>>> the cases where the return value isn't checked, it would feel safer if we
>>> continued clearing the value. Even if in all cases where this could matter
>>> (besides the one you cover here) one can prove correctness by looking at
>>> surrounding code.
>> I didn't realise I'd made a change at first, but it's a consequence of
>> the compiler's ability to rearrange basic blocks.
>>
>> It can be fixed with ...
>>
>>>> --- a/xen/arch/x86/include/asm/msr.h
>>>> +++ b/xen/arch/x86/include/asm/msr.h
>>>> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>>>>  /* rdmsr with exception handling */
>>>>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>>>>  {
>>>> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
>>>> +    uint64_t lo, hi;
>>>> +    asm_inline goto (
>>>> +        "1: rdmsr\n\t"
>>>> +        _ASM_EXTABLE(1b, %l[fault])
>>>> +        : "=a" (lo), "=d" (hi)
>>>> +        : "c" (msr)
>>>> +        :
>>>> +        : fault );
>>>> +
>>>> +    *val = lo | (hi << 32);
>>>> +
>>>> +    return 0;
>>>> +
>>>> + fault:
>>     *val = 0;
>>
>> here, but I don't want to do this.  Because val is by pointer and
>> generally spilled to the stack, the compiler can't optimise away the store.
> But the compiler is dealing with such indirection in inline functions just
> fine. I don't expect it would typically spill val to the stack. Is there
> anything specific here that you think would make this more likely?

Yes.  The design of the functions they're used in.  Adding this line
results in:

add/remove: 0/0 grow/shrink: 7/2 up/down: 109/-36 (73)
Function                                     old     new   delta
read_msr                                    1243    1307     +64
resource_access                              326     341     +15
hwp_init_msrs.cold                           297     308     +11
probe_cpuid_faulting                         168     175      +7
svm_msr_read_intercept                      1034    1039      +5
hwp_write_request                            113     117      +4
hwp_init_msrs                                371     374      +3
amd_log_freq                                 844     828     -16
guest_rdmsr                                 2168    2148     -20

Taking read_msr() as a concrete example, this is because it's a store
into a parent functions variable, not into a local variable, and cannot
be elided.


>
>> I'd far rather get a real compiler error, than to have logic relying on
>> the result of a faulting MSR read.
> A compiler error? (Hmm, perhaps you think of uninitialized variable
> diagnostics. That may or may not trigger, depending on how else the
> caller's variable is used.)

Yes I was referring to the uninitialised variable diagnostic.  *_safe()
are fairly rare, and we've got plenty of coverage in CI.

~Andrew

Re: [PATCH v2 02/16] x86/msr: Rework rdmsr_safe() using asm goto()

Posted by Jan Beulich 2 months, 1 week ago

On 21.08.2025 18:20, Andrew Cooper wrote:
> On 19/08/2025 5:23 pm, Jan Beulich wrote:
>> On 19.08.2025 15:52, Andrew Cooper wrote:
>>> On 18/08/2025 12:27 pm, Jan Beulich wrote:
>>>> On 15.08.2025 22:41, Andrew Cooper wrote:
>>>>> ... on capable toolchains.
>>>>>
>>>>> This avoids needing to hold rc in a register across the RDMSR, and in most
>>>>> cases removes direct testing and branching based on rc, as the fault label can
>>>>> be rearranged to directly land on the out-of-line block.
>>>>>
>>>>> There is a subtle difference in behaviour.  The old behaviour would, on fault,
>>>>> still produce 0's and write to val.
>>>>>
>>>>> The new behaviour only writes val on success, and write_msr() is the only
>>>>> place where this matters.  Move temp out of switch() scope and initialise it
>>>>> to 0.
>>>> But what's the motivation behind making this behavioral change? At least in
>>>> the cases where the return value isn't checked, it would feel safer if we
>>>> continued clearing the value. Even if in all cases where this could matter
>>>> (besides the one you cover here) one can prove correctness by looking at
>>>> surrounding code.
>>> I didn't realise I'd made a change at first, but it's a consequence of
>>> the compiler's ability to rearrange basic blocks.
>>>
>>> It can be fixed with ...
>>>
>>>>> --- a/xen/arch/x86/include/asm/msr.h
>>>>> +++ b/xen/arch/x86/include/asm/msr.h
>>>>> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>>>>>  /* rdmsr with exception handling */
>>>>>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>>>>>  {
>>>>> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
>>>>> +    uint64_t lo, hi;
>>>>> +    asm_inline goto (
>>>>> +        "1: rdmsr\n\t"
>>>>> +        _ASM_EXTABLE(1b, %l[fault])
>>>>> +        : "=a" (lo), "=d" (hi)
>>>>> +        : "c" (msr)
>>>>> +        :
>>>>> +        : fault );
>>>>> +
>>>>> +    *val = lo | (hi << 32);
>>>>> +
>>>>> +    return 0;
>>>>> +
>>>>> + fault:
>>>     *val = 0;
>>>
>>> here, but I don't want to do this.  Because val is by pointer and
>>> generally spilled to the stack, the compiler can't optimise away the store.
>> But the compiler is dealing with such indirection in inline functions just
>> fine. I don't expect it would typically spill val to the stack. Is there
>> anything specific here that you think would make this more likely?
> 
> Yes.  The design of the functions they're used in.  Adding this line
> results in:
> 
> add/remove: 0/0 grow/shrink: 7/2 up/down: 109/-36 (73)
> Function                                     old     new   delta
> read_msr                                    1243    1307     +64
> resource_access                              326     341     +15
> hwp_init_msrs.cold                           297     308     +11
> probe_cpuid_faulting                         168     175      +7
> svm_msr_read_intercept                      1034    1039      +5
> hwp_write_request                            113     117      +4
> hwp_init_msrs                                371     374      +3
> amd_log_freq                                 844     828     -16
> guest_rdmsr                                 2168    2148     -20
> 
> Taking read_msr() as a concrete example, this is because it's a store
> into a parent functions variable, not into a local variable, and cannot
> be elided.
> 
> 
>>
>>> I'd far rather get a real compiler error, than to have logic relying on
>>> the result of a faulting MSR read.
>> A compiler error? (Hmm, perhaps you think of uninitialized variable
>> diagnostics. That may or may not trigger, depending on how else the
>> caller's variable is used.)
> 
> Yes I was referring to the uninitialised variable diagnostic.  *_safe()
> are fairly rare, and we've got plenty of coverage in CI.

Well, okay, slightly hesitantly
Reviewed-by: Jan Beulich <jbeulich@suse.com>
preferably with the paragraph in the description that I commented on
slightly expanded to cover the "why" aspect.

Jan