[PATCH] x86: correct asm() constraints when dealing with immediate selector values

Jan Beulich posted 1 patch 2 years, 6 months ago
Test gitlab-ci passed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/87278164-2395-1fb4-7569-9151d0151e8a@suse.com
There is a newer version of this series
[PATCH] x86: correct asm() constraints when dealing with immediate selector values
Posted by Jan Beulich 2 years, 6 months ago
asm() constraints need to fit both the intended insn(s) which the
respective operands are going to be used with as well as the actual kind
of value specified. "m" (alone) together with a constant, however, leads
to gcc saying

error: memory input <N> is not directly addressable

while clang complains

error: invalid lvalue in asm input for constraint 'm'

And rightly so - in order to access a memory operand, an address needs
to be specified to the insn. In some cases it might be possible for a
compiler to synthesize a memory operand holding the requested constant,
but I think any solution there would have sharp edges.

If "m" alone doesn't work with constants, it is at best pointless (and
perhaps misleading or even risky - the compiler might decide to actually
pick "m" and not try very hard to find a suitable register) to specify
it alongside "r".

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -736,7 +736,7 @@ void __init detect_zen2_null_seg_behavio
 	uint64_t base;
 
 	wrmsrl(MSR_FS_BASE, 1);
-	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
+	asm volatile ( "mov %0, %%fs" :: "r" (0) );
 	rdmsrl(MSR_FS_BASE, base);
 
 	if (base == 0)
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -248,7 +248,7 @@ void do_double_fault(struct cpu_user_reg
 
     console_force_unlock();
 
-    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_SELECTOR) );
+    asm ( "lsll %1, %0" : "=r" (cpu) : "r" (PER_CPU_SELECTOR) );
 
     /* Find information saved during fault and dump it to the console. */
     printk("*** DOUBLE FAULT ***\n");


Re: [PATCH] x86: correct asm() constraints when dealing with immediate selector values
Posted by Andrew Cooper 2 years, 6 months ago
On 09/09/2021 15:56, Jan Beulich wrote:
> asm() constraints need to fit both the intended insn(s) which the
> respective operands are going to be used with as well as the actual kind
> of value specified. "m" (alone) together with a constant, however, leads
> to gcc saying
>
> error: memory input <N> is not directly addressable
>
> while clang complains
>
> error: invalid lvalue in asm input for constraint 'm'
>
> And rightly so - in order to access a memory operand, an address needs
> to be specified to the insn. In some cases it might be possible for a
> compiler to synthesize a memory operand holding the requested constant,
> but I think any solution there would have sharp edges.

It's precisely what happens in the other uses of constants which you
haven't adjusted below.  Constants are fine if being propagated into a
static inline which has properly typed parameters.

Or are you saying automatic spilling when a width isn't necessarily known?

> If "m" alone doesn't work with constants, it is at best pointless (and
> perhaps misleading or even risky - the compiler might decide to actually
> pick "m" and not try very hard to find a suitable register) to specify
> it alongside "r".
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I'm slightly surprised you didn't spot and comment about what Clang does
with this.

https://godbolt.org/z/M4nrerrWM

For "rm" (0), clang really does spill the constant into a global and
generate a rip-relative mov to fs, which is especially funny considering
the rejection of "m" as a constraint.

Clang even spills "rm" (local) into a global, while "m" (local) does
come from the stack.


I think there is a reasonable argument to say "m" (const) doesn't have
enough type (width) information for a compiler to do anything sensible
with, and therefore it is fair to be dropped.

But for "rm" (var), where there is enough width information, I don't
think it is reasonable to drop the "m" and restrict the flexibility.

Furthermore, I'm going to argue that we shouldn't work around this
behaviour by removing "m" elsewhere.  This code generation
bug/misfeature affects every use of "rm", even when the referenced
operand is on the stack and can be used without unspilling first.

>
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -736,7 +736,7 @@ void __init detect_zen2_null_seg_behavio
>  	uint64_t base;
>  
>  	wrmsrl(MSR_FS_BASE, 1);
> -	asm volatile ( "mov %0, %%fs" :: "rm" (0) );
> +	asm volatile ( "mov %0, %%fs" :: "r" (0) );
>  	rdmsrl(MSR_FS_BASE, base);
>  
>  	if (base == 0)
> --- a/xen/arch/x86/x86_64/traps.c
> +++ b/xen/arch/x86/x86_64/traps.c
> @@ -248,7 +248,7 @@ void do_double_fault(struct cpu_user_reg
>  
>      console_force_unlock();
>  
> -    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_SELECTOR) );
> +    asm ( "lsll %1, %0" : "=r" (cpu) : "r" (PER_CPU_SELECTOR) );

Any chance we can clean this up to: lsl %[lim], %[sel] seeing as the
line is being edited?

~Andrew


Re: [PATCH] x86: correct asm() constraints when dealing with immediate selector values
Posted by Jan Beulich 2 years, 6 months ago
On 09.09.2021 21:31, Andrew Cooper wrote:
> On 09/09/2021 15:56, Jan Beulich wrote:
>> asm() constraints need to fit both the intended insn(s) which the
>> respective operands are going to be used with as well as the actual kind
>> of value specified. "m" (alone) together with a constant, however, leads
>> to gcc saying
>>
>> error: memory input <N> is not directly addressable
>>
>> while clang complains
>>
>> error: invalid lvalue in asm input for constraint 'm'
>>
>> And rightly so - in order to access a memory operand, an address needs
>> to be specified to the insn. In some cases it might be possible for a
>> compiler to synthesize a memory operand holding the requested constant,
>> but I think any solution there would have sharp edges.
> 
> It's precisely what happens in the other uses of constants which you
> haven't adjusted below.  Constants are fine if being propagated into a
> static inline which has properly typed parameters.
> 
> Or are you saying automatic spilling when a width isn't necessarily known?

The lack of width information is a secondary aspect, yes. But the primary
aspects with inline functions (as opposed to open-coded uses) is that the
inline function's parameter can be taken the address of, and hence is
both addressable (gcc) and an lvalue (clang).

>> If "m" alone doesn't work with constants, it is at best pointless (and
>> perhaps misleading or even risky - the compiler might decide to actually
>> pick "m" and not try very hard to find a suitable register) to specify
>> it alongside "r".
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I'm slightly surprised you didn't spot and comment about what Clang does
> with this.
> 
> https://godbolt.org/z/M4nrerrWM
> 
> For "rm" (0), clang really does spill the constant into a global and
> generate a rip-relative mov to fs, which is especially funny considering
> the rejection of "m" as a constraint.

I had no reason to suspect such imo entirely inconsistent behavior, so
I didn't look at the generated code.

> Clang even spills "rm" (local) into a global, while "m" (local) does
> come from the stack.

"rm" (local)? That would be racy (and hence imo a compiler bug). DYM
"rm" (constant)? Or DYM spilling onto the stack (which is what I
observe with clang5, oddly enough independent of optimization level)?

> I think there is a reasonable argument to say "m" (const) doesn't have
> enough type (width) information for a compiler to do anything sensible
> with, and therefore it is fair to be dropped.
> 
> But for "rm" (var), where there is enough width information, I don't
> think it is reasonable to drop the "m" and restrict the flexibility.

Correct, and I don't do this. What I alter are instances of "rm" (const).

> Furthermore, I'm going to argue that we shouldn't work around this
> behaviour by removing "m" elsewhere.  This code generation
> bug/misfeature affects every use of "rm", even when the referenced
> operand is on the stack and can be used without unspilling first.

As long as the generated code is correct, I agree. See above for a case
where it might actually not be.

>> --- a/xen/arch/x86/x86_64/traps.c
>> +++ b/xen/arch/x86/x86_64/traps.c
>> @@ -248,7 +248,7 @@ void do_double_fault(struct cpu_user_reg
>>  
>>      console_force_unlock();
>>  
>> -    asm ( "lsll %1, %0" : "=r" (cpu) : "rm" (PER_CPU_SELECTOR) );
>> +    asm ( "lsll %1, %0" : "=r" (cpu) : "r" (PER_CPU_SELECTOR) );
> 
> Any chance we can clean this up to: lsl %[lim], %[sel] seeing as the
> line is being edited?

Sure, no problem at all.

Jan


Re: [PATCH] x86: correct asm() constraints when dealing with immediate selector values
Posted by Jan Beulich 2 years, 6 months ago
On 09.09.2021 21:31, Andrew Cooper wrote:
> On 09/09/2021 15:56, Jan Beulich wrote:
>> asm() constraints need to fit both the intended insn(s) which the
>> respective operands are going to be used with as well as the actual kind
>> of value specified. "m" (alone) together with a constant, however, leads
>> to gcc saying
>>
>> error: memory input <N> is not directly addressable
>>
>> while clang complains
>>
>> error: invalid lvalue in asm input for constraint 'm'
>>
>> And rightly so - in order to access a memory operand, an address needs
>> to be specified to the insn. In some cases it might be possible for a
>> compiler to synthesize a memory operand holding the requested constant,
>> but I think any solution there would have sharp edges.
> 
> It's precisely what happens in the other uses of constants which you
> haven't adjusted below.  Constants are fine if being propagated into a
> static inline which has properly typed parameters.
> 
> Or are you saying automatic spilling when a width isn't necessarily known?
> 
>> If "m" alone doesn't work with constants, it is at best pointless (and
>> perhaps misleading or even risky - the compiler might decide to actually
>> pick "m" and not try very hard to find a suitable register) to specify
>> it alongside "r".
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I'm slightly surprised you didn't spot and comment about what Clang does
> with this.
> 
> https://godbolt.org/z/M4nrerrWM
> 
> For "rm" (0), clang really does spill the constant into a global and
> generate a rip-relative mov to fs, which is especially funny considering
> the rejection of "m" as a constraint.
> 
> Clang even spills "rm" (local) into a global, while "m" (local) does
> come from the stack.

Which leads to a lack of diagnosing bad code, in e.g.:

void movq(unsigned val) {
//	asm volatile ( "movq %0, %%xmm0" :: "r" (0) );
//	asm volatile ( "movq %0, %%xmm0" :: "m" (0) );
	asm volatile ( "movq %0, %%xmm0" :: "rm" (0) );
}

The "r" variant gets diagnosed (at the assembly stage, as a 32-bit GPR
is invalid as an operand to MOVQ). The "rm" variant gets compiled to a
64-bit memory access of a 32-bit memory location.

Jan