Unlike the WRMSR side, we can't use asm goto() unconditionally, because our
toolchain baseline doesn't support asm goto with outputs.
Detect support, and use asm goto() when available.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
Condensing these two to a single Kconfig symbol is prohibitively difficult.
Also, there's a different errata workaround we'll need if we want to use asm
goto() with "+" constraints:
config CC_HAS_ASM_GOTO_TIED_OUTPUT
depends on CC_HAS_ASM_GOTO_OUTPUT
# Detect buggy gcc and clang, fixed in gcc-11 clang-14.
def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - .": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o /dev/null)
I'm tempted to put it in straight away, lest we forget about it.
---
xen/Kconfig | 14 ++++++++++++++
xen/arch/x86/cpu/amd.c | 13 +++++++++++++
2 files changed, 27 insertions(+)
diff --git a/xen/Kconfig b/xen/Kconfig
index 2128f0ccfc0b..a367337cafa5 100644
--- a/xen/Kconfig
+++ b/xen/Kconfig
@@ -41,6 +41,20 @@ config CC_SPLIT_SECTIONS
config CC_HAS_UBSAN
def_bool $(cc-option,-fsanitize=undefined)
+# Fixed in GCC 14, 13.3, 12.4 and 11.5
+# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
+config GCC_ASM_GOTO_OUTPUT_BROKEN
+ bool
+ depends on CC_IS_GCC
+ default y if GCC_VERSION < 110500
+ default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
+ default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300
+
+config CC_HAS_ASM_GOTO_OUTPUT
+ def_bool y
+ depends on !GCC_ASM_GOTO_OUTPUT_BROKEN
+ depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
+
# Set code alignment.
#
# Allow setting on a boolean basis, and then convert such selection to an
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index ce4e1df71064..37d67dd15c89 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -59,6 +59,18 @@ static bool __read_mostly fam17_c6_disabled;
static inline int rdmsr_amd_safe(unsigned int msr, unsigned int *lo,
unsigned int *hi)
{
+#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
+ asm goto ( "1: rdmsr\n\t"
+ _ASM_EXTABLE(1b, %l[fault])
+ : "=a" (*lo), "=d" (*hi)
+ : "c" (msr), "D" (0x9c5a203a)
+ :
+ : fault );
+ return 0;
+
+ fault:
+ return -EFAULT;
+#else
int err;
asm volatile("1: rdmsr\n2:\n"
@@ -71,6 +83,7 @@ static inline int rdmsr_amd_safe(unsigned int msr, unsigned int *lo,
: "c" (msr), "D" (0x9c5a203a), "2" (0), "i" (-EFAULT));
return err;
+#endif
}
static inline int wrmsr_amd_safe(unsigned int msr, unsigned int lo,
base-commit: a235f856e4bbd270b085590e1f5fc9599234dcdf
prerequisite-patch-id: 2874d23382cbbc2ec8eaad596b6f49f14b990b8a
prerequisite-patch-id: fdc7cdec843e4d24fc64006c977fa83a55d98910
--
2.39.5
On 07.04.2025 17:35, Andrew Cooper wrote:
> Unlike the WRMSR side, we can't use asm goto() unconditionally, because our
> toolchain baseline doesn't support asm goto with outputs.
Is there actually a benefit we gain from now needing to maintain two different
pieces of logic fulfilling the same purpose?
> Also, there's a different errata workaround we'll need if we want to use asm
> goto() with "+" constraints:
>
> config CC_HAS_ASM_GOTO_TIED_OUTPUT
> depends on CC_HAS_ASM_GOTO_OUTPUT
> # Detect buggy gcc and clang, fixed in gcc-11 clang-14.
> def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - .": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o /dev/null)
>
> I'm tempted to put it in straight away, lest we forget about it.
Perhaps best if we really want to go this route. Yet then - why "TIED"? Isn't
"tied" the term they use when referring to an earlier operand by using a
digit (or the operand's name in square brackets)?
> --- a/xen/Kconfig
> +++ b/xen/Kconfig
> @@ -41,6 +41,20 @@ config CC_SPLIT_SECTIONS
> config CC_HAS_UBSAN
> def_bool $(cc-option,-fsanitize=undefined)
>
> +# Fixed in GCC 14, 13.3, 12.4 and 11.5
> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
> +config GCC_ASM_GOTO_OUTPUT_BROKEN
> + bool
> + depends on CC_IS_GCC
> + default y if GCC_VERSION < 110500
> + default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
> + default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300
Unlike for pre-release versions (x.0.y) I view this as problematic. Distros
are likely to have backported the fix before the minor releases took place.
Or they may have backported without ever meaning to follow later minor
releases. We'd needlessly exclude them here. Imo ...
> +config CC_HAS_ASM_GOTO_OUTPUT
> + def_bool y
> + depends on !GCC_ASM_GOTO_OUTPUT_BROKEN
> + depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
... the only option is to actually probe for support as well as the (non-)
buggy-ness.
Jan
On 07/04/2025 4:48 pm, Jan Beulich wrote:
> On 07.04.2025 17:35, Andrew Cooper wrote:
>> Unlike the WRMSR side, we can't use asm goto() unconditionally, because our
>> toolchain baseline doesn't support asm goto with outputs.
> Is there actually a benefit we gain from now needing to maintain two different
> pieces of logic fulfilling the same purpose?
IMO, yes. Besides getting rid of .fixup/unlikely, the code generation
is better-enough to warrant it, including getting the common path
correct (the referenced labels are all considered cold).
e.g. for this change, we go from:
xor %esi,%esi
rdmsr
test %esi,%esi
jne <init_amd+0x540>
and $0xfffffffe,%edx
wrmsr
(note the forward branch) to simply:
rdmsr
and $0xfffffffe,%edx
wrmsr
because the exception table redirect is directly into init_amd.cold, and
we don't have to hold `int err` in a register across the asm() block.
This is an intentionally simple example to get the infrastructure in,
but vmread() will definitely benefit.
>
>> Also, there's a different errata workaround we'll need if we want to use asm
>> goto() with "+" constraints:
>>
>> config CC_HAS_ASM_GOTO_TIED_OUTPUT
>> depends on CC_HAS_ASM_GOTO_OUTPUT
>> # Detect buggy gcc and clang, fixed in gcc-11 clang-14.
>> def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - .": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o /dev/null)
>>
>> I'm tempted to put it in straight away, lest we forget about it.
> Perhaps best if we really want to go this route. Yet then - why "TIED"? Isn't
> "tied" the term they use when referring to an earlier operand by using a
> digit (or the operand's name in square brackets)?
This is straight from Linux. I've not looked at the issue in detail.
>
>> --- a/xen/Kconfig
>> +++ b/xen/Kconfig
>> @@ -41,6 +41,20 @@ config CC_SPLIT_SECTIONS
>> config CC_HAS_UBSAN
>> def_bool $(cc-option,-fsanitize=undefined)
>>
>> +# Fixed in GCC 14, 13.3, 12.4 and 11.5
>> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
>> +config GCC_ASM_GOTO_OUTPUT_BROKEN
>> + bool
>> + depends on CC_IS_GCC
>> + default y if GCC_VERSION < 110500
>> + default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
>> + default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300
> Unlike for pre-release versions (x.0.y) I view this as problematic. Distros
> are likely to have backported the fix before the minor releases took place.
> Or they may have backported without ever meaning to follow later minor
> releases. We'd needlessly exclude them here. Imo ...
>
>> +config CC_HAS_ASM_GOTO_OUTPUT
>> + def_bool y
>> + depends on !GCC_ASM_GOTO_OUTPUT_BROKEN
>> + depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
> ... the only option is to actually probe for support as well as the (non-)
> buggy-ness.
There is no sensible way to probe. It compiles fine, but (AIUI) fails
to spill registers correctly on some paths, which also makes it very
sensitive to other optimisations.
~Andrew
On 17.04.2025 19:24, Andrew Cooper wrote:
> On 07/04/2025 4:48 pm, Jan Beulich wrote:
>> On 07.04.2025 17:35, Andrew Cooper wrote:
>>> Unlike the WRMSR side, we can't use asm goto() unconditionally, because our
>>> toolchain baseline doesn't support asm goto with outputs.
>> Is there actually a benefit we gain from now needing to maintain two different
>> pieces of logic fulfilling the same purpose?
>
> IMO, yes. Besides getting rid of .fixup/unlikely, the code generation
> is better-enough to warrant it, including getting the common path
> correct (the referenced labels are all considered cold).
>
> e.g. for this change, we go from:
>
> xor %esi,%esi
> rdmsr
> test %esi,%esi
> jne <init_amd+0x540>
> and $0xfffffffe,%edx
> wrmsr
>
> (note the forward branch) to simply:
>
> rdmsr
> and $0xfffffffe,%edx
> wrmsr
>
> because the exception table redirect is directly into init_amd.cold, and
> we don't have to hold `int err` in a register across the asm() block.
>
> This is an intentionally simple example to get the infrastructure in,
> but vmread() will definitely benefit.
>
>>
>>> Also, there's a different errata workaround we'll need if we want to use asm
>>> goto() with "+" constraints:
>>>
>>> config CC_HAS_ASM_GOTO_TIED_OUTPUT
>>> depends on CC_HAS_ASM_GOTO_OUTPUT
>>> # Detect buggy gcc and clang, fixed in gcc-11 clang-14.
>>> def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - .": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o /dev/null)
>>>
>>> I'm tempted to put it in straight away, lest we forget about it.
>> Perhaps best if we really want to go this route. Yet then - why "TIED"? Isn't
>> "tied" the term they use when referring to an earlier operand by using a
>> digit (or the operand's name in square brackets)?
>
> This is straight from Linux. I've not looked at the issue in detail.
So what I see is that Sean in Linux commit 1aa0e8b144b6 uses this term also
in the description. I'm unconvinced it's correct, though. Gcc doc doesn't
call the "+" modifier anything special, and it calls the numeric constraints
(for which "+" can be a shorthand in certain cases) "matching constraint".
We can of course sort the naming in the eventual patch pulling in that
behavior, yet I'd like to suggest already now that we don't blindly follow
Linux'es naming (unless the choice can be backed by some doc reference).
>>> --- a/xen/Kconfig
>>> +++ b/xen/Kconfig
>>> @@ -41,6 +41,20 @@ config CC_SPLIT_SECTIONS
>>> config CC_HAS_UBSAN
>>> def_bool $(cc-option,-fsanitize=undefined)
>>>
>>> +# Fixed in GCC 14, 13.3, 12.4 and 11.5
>>> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
>>> +config GCC_ASM_GOTO_OUTPUT_BROKEN
>>> + bool
>>> + depends on CC_IS_GCC
>>> + default y if GCC_VERSION < 110500
>>> + default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
>>> + default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300
>> Unlike for pre-release versions (x.0.y) I view this as problematic. Distros
>> are likely to have backported the fix before the minor releases took place.
>> Or they may have backported without ever meaning to follow later minor
>> releases. We'd needlessly exclude them here. Imo ...
>>
>>> +config CC_HAS_ASM_GOTO_OUTPUT
>>> + def_bool y
>>> + depends on !GCC_ASM_GOTO_OUTPUT_BROKEN
>>> + depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
>> ... the only option is to actually probe for support as well as the (non-)
>> buggy-ness.
>
> There is no sensible way to probe. It compiles fine, but (AIUI) fails
> to spill registers correctly on some paths, which also makes it very
> sensitive to other optimisations.
Hmm, okay, Linux commit f2f6a8e88717 kind of suggests that there might have
been more issues in gcc. Really I can't help the impression that the issue
still wasn't fully understood, and hence may re-surface in another context.
In which case I guess I agree the above is the best we can do for the time
being, until we learn of further breakage:
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan
On 22/04/2025 10:07 am, Jan Beulich wrote:
> On 17.04.2025 19:24, Andrew Cooper wrote:
>> On 07/04/2025 4:48 pm, Jan Beulich wrote:
>>> On 07.04.2025 17:35, Andrew Cooper wrote:
>>>> Unlike the WRMSR side, we can't use asm goto() unconditionally, because our
>>>> toolchain baseline doesn't support asm goto with outputs.
>>> Is there actually a benefit we gain from now needing to maintain two different
>>> pieces of logic fulfilling the same purpose?
>> IMO, yes. Besides getting rid of .fixup/unlikely, the code generation
>> is better-enough to warrant it, including getting the common path
>> correct (the referenced labels are all considered cold).
>>
>> e.g. for this change, we go from:
>>
>> xor %esi,%esi
>> rdmsr
>> test %esi,%esi
>> jne <init_amd+0x540>
>> and $0xfffffffe,%edx
>> wrmsr
>>
>> (note the forward branch) to simply:
>>
>> rdmsr
>> and $0xfffffffe,%edx
>> wrmsr
>>
>> because the exception table redirect is directly into init_amd.cold, and
>> we don't have to hold `int err` in a register across the asm() block.
>>
>> This is an intentionally simple example to get the infrastructure in,
>> but vmread() will definitely benefit.
>>
>>>> Also, there's a different errata workaround we'll need if we want to use asm
>>>> goto() with "+" constraints:
>>>>
>>>> config CC_HAS_ASM_GOTO_TIED_OUTPUT
>>>> depends on CC_HAS_ASM_GOTO_OUTPUT
>>>> # Detect buggy gcc and clang, fixed in gcc-11 clang-14.
>>>> def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - .": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o /dev/null)
>>>>
>>>> I'm tempted to put it in straight away, lest we forget about it.
>>> Perhaps best if we really want to go this route. Yet then - why "TIED"? Isn't
>>> "tied" the term they use when referring to an earlier operand by using a
>>> digit (or the operand's name in square brackets)?
>> This is straight from Linux. I've not looked at the issue in detail.
> So what I see is that Sean in Linux commit 1aa0e8b144b6 uses this term also
> in the description. I'm unconvinced it's correct, though. Gcc doc doesn't
> call the "+" modifier anything special, and it calls the numeric constraints
> (for which "+" can be a shorthand in certain cases) "matching constraint".
>
> We can of course sort the naming in the eventual patch pulling in that
> behavior, yet I'd like to suggest already now that we don't blindly follow
> Linux'es naming (unless the choice can be backed by some doc reference).
In the meantime, I found
https://clang.llvm.org/docs/LanguageExtensions.html#asm-goto-with-output-constraints
"When using tied-outputs (i.e. outputs that are inputs and outputs, not
just outputs) with the +r constraint"
So Clang does name this case specifically as tied outputs.
>
>>>> --- a/xen/Kconfig
>>>> +++ b/xen/Kconfig
>>>> @@ -41,6 +41,20 @@ config CC_SPLIT_SECTIONS
>>>> config CC_HAS_UBSAN
>>>> def_bool $(cc-option,-fsanitize=undefined)
>>>>
>>>> +# Fixed in GCC 14, 13.3, 12.4 and 11.5
>>>> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
>>>> +config GCC_ASM_GOTO_OUTPUT_BROKEN
>>>> + bool
>>>> + depends on CC_IS_GCC
>>>> + default y if GCC_VERSION < 110500
>>>> + default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
>>>> + default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300
>>> Unlike for pre-release versions (x.0.y) I view this as problematic. Distros
>>> are likely to have backported the fix before the minor releases took place.
>>> Or they may have backported without ever meaning to follow later minor
>>> releases. We'd needlessly exclude them here. Imo ...
>>>
>>>> +config CC_HAS_ASM_GOTO_OUTPUT
>>>> + def_bool y
>>>> + depends on !GCC_ASM_GOTO_OUTPUT_BROKEN
>>>> + depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
>>> ... the only option is to actually probe for support as well as the (non-)
>>> buggy-ness.
>> There is no sensible way to probe. It compiles fine, but (AIUI) fails
>> to spill registers correctly on some paths, which also makes it very
>> sensitive to other optimisations.
> Hmm, okay, Linux commit f2f6a8e88717 kind of suggests that there might have
> been more issues in gcc. Really I can't help the impression that the issue
> still wasn't fully understood, and hence may re-surface in another context.
> In which case I guess I agree the above is the best we can do for the time
> being, until we learn of further breakage:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Thanks.
~Andrew
© 2016 - 2025 Red Hat, Inc.