[v1] x86/entry: shrink insn size for some of our EFLAGS manipulation

[PATCH] x86/entry: shrink insn size for some of our EFLAGS manipulation

Posted by Jan Beulich 1 year, 11 months ago

Much like was recently done for setting entry vector, and along the
lines of what we already had in handle_exception_saved, avoid 32-bit
immediates where 8-bit ones do. Reduces .text.entry size by 16 bytes in
my non-CET reference build, while in my CET reference build section size
doesn't change (there and in .text only padding space increases).

Inspired by other long->byte conversion work.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Numbers above are biased by me also having the straight-line-speculation
change in the tree, thus every JMP is followed by an INT3. Without that,
.text.entry size would also shrink by 16 bytes in the CET build.

--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
         movq  %rax,TRAPBOUNCE_eip(%rdx)
         movb  %cl,TRAPBOUNCE_flags(%rdx)
         call  create_bounce_frame
-        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
+        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
 /* %rbx: struct vcpu */
 test_all_events:
         ASSERT_NOT_IN_ATOMIC
@@ -223,7 +223,7 @@ LABEL_LOCAL(.Lrestore_rcx_iret_exit_to_g
 /* No special register assumptions. */
 iret_exit_to_guest:
         andl  $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), EFRAME_eflags(%rsp)
-        orl   $X86_EFLAGS_IF, EFRAME_eflags(%rsp)
+        orb   $X86_EFLAGS_IF >> 8, EFRAME_eflags + 1(%rsp)
         addq  $8,%rsp
 .Lft0:  iretq
         _ASM_PRE_EXTABLE(.Lft0, handle_exception)
@@ -346,7 +346,7 @@ LABEL(sysenter_eflags_saved, 0)
 
         GET_STACK_END(bx)
         /* PUSHF above has saved EFLAGS.IF clear (the caller had it set). */
-        orl   $X86_EFLAGS_IF, UREGS_eflags(%rsp)
+        orb   $X86_EFLAGS_IF >> 8, UREGS_eflags + 1(%rsp)
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
         test  %rcx, %rcx
         jz    .Lsyse_cr3_okay
@@ -361,11 +361,11 @@ LABEL(sysenter_eflags_saved, 0)
         cmpb  $0,VCPU_sysenter_disables_events(%rbx)
         movq  VCPU_sysenter_addr(%rbx),%rax
         setne %cl
-        testl $X86_EFLAGS_NT,UREGS_eflags(%rsp)
+        testb $X86_EFLAGS_NT >> 8, UREGS_eflags + 1(%rsp)
         leaq  VCPU_trap_bounce(%rbx),%rdx
 UNLIKELY_START(nz, sysenter_nt_set)
         pushfq
-        andl  $~X86_EFLAGS_NT,(%rsp)
+        andb  $~(X86_EFLAGS_NT >> 8), 1(%rsp)
         popfq
 UNLIKELY_END(sysenter_nt_set)
         testq %rax,%rax

Re: [PATCH] x86/entry: shrink insn size for some of our EFLAGS manipulation

Posted by Andrew Cooper 1 year, 11 months ago

On 05/03/2024 2:04 pm, Jan Beulich wrote:
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>          call  create_bounce_frame
> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)

The other adjustments are fine, but what on earth are we doing with DF here?

This looks straight up buggy.  We've got no legitimate reason to be
playing with the guest's view of DF.

~Andrew

Re: [PATCH] x86/entry: shrink insn size for some of our EFLAGS manipulation

Posted by Jan Beulich 1 year, 11 months ago

On 06.03.2024 11:33, Andrew Cooper wrote:
> On 05/03/2024 2:04 pm, Jan Beulich wrote:
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>>          call  create_bounce_frame
>> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
>> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
> 
> The other adjustments are fine, but what on earth are we doing with DF here?
> 
> This looks straight up buggy.  We've got no legitimate reason to be
> playing with the guest's view of DF.

This is the PV equivalent of the SYSCALL_MASK MSR, isn't it? With it not
really being an (emulated) MSR, but an agreement that guests will only ever
care about having DF cleared (besides the requested way of dealing with the
event mask). One of the many things not written down anywhere ...

Jan

Re: [PATCH] x86/entry: shrink insn size for some of our EFLAGS manipulation

Posted by Andrew Cooper 1 year, 11 months ago

On 06/03/2024 10:49 am, Jan Beulich wrote:
> On 06.03.2024 11:33, Andrew Cooper wrote:
>> On 05/03/2024 2:04 pm, Jan Beulich wrote:
>>> --- a/xen/arch/x86/x86_64/entry.S
>>> +++ b/xen/arch/x86/x86_64/entry.S
>>> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>>>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>>>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>>>          call  create_bounce_frame
>>> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
>>> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
>> The other adjustments are fine, but what on earth are we doing with DF here?
>>
>> This looks straight up buggy.  We've got no legitimate reason to be
>> playing with the guest's view of DF.
> This is the PV equivalent of the SYSCALL_MASK MSR, isn't it?

Well, is it?

It definitely never existed in 32bit builds of Xen, when the int80
direct trap existed.

I don't see any evidence of it applying anywhere for compat PV guests
either, even those with syscall enabled.

> With it not
> really being an (emulated) MSR, but an agreement that guests will only ever
> care about having DF cleared (besides the requested way of dealing with the
> event mask). One of the many things not written down anywhere ...

If it's not written down, it doesn't exist...

And even if this is supposed to be a PV-FMASK-ish thing, it's buggy
because it apples also when #UD is raised for no registered callback. 
(And yes, I realise there is a chronology issues here (the #UD check is
the newer element), but it really will corrupt state as presented in a
SIGSEGV frame.

The question we need to answer is whether there is any remotely-credible
way that a PV guest kernel author could be expecting this behaviour and
relying on it.

I honestly don't think there is.

It fails the principle of least surprise compared to native behaviour,
32bit PV behaviour, and to every non-SYSCALL based 64bit event also.

It either needs writing down somewhere (and the #UD case fixing), or it
needs deleing, because continuing to leave it in this state isn't ok.

~Andrew

Re: [PATCH] x86/entry: shrink insn size for some of our EFLAGS manipulation

Posted by Jan Beulich 1 year, 11 months ago

On 06.03.2024 12:14, Andrew Cooper wrote:
> On 06/03/2024 10:49 am, Jan Beulich wrote:
>> On 06.03.2024 11:33, Andrew Cooper wrote:
>>> On 05/03/2024 2:04 pm, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/x86_64/entry.S
>>>> +++ b/xen/arch/x86/x86_64/entry.S
>>>> @@ -52,7 +52,7 @@ UNLIKELY_END(syscall_no_callback)
>>>>          movq  %rax,TRAPBOUNCE_eip(%rdx)
>>>>          movb  %cl,TRAPBOUNCE_flags(%rdx)
>>>>          call  create_bounce_frame
>>>> -        andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
>>>> +        andb  $~(X86_EFLAGS_DF >> 8), UREGS_eflags + 1(%rsp)
>>> The other adjustments are fine, but what on earth are we doing with DF here?
>>>
>>> This looks straight up buggy.  We've got no legitimate reason to be
>>> playing with the guest's view of DF.
>> This is the PV equivalent of the SYSCALL_MASK MSR, isn't it?
> 
> Well, is it?
> 
> It definitely never existed in 32bit builds of Xen, when the int80
> direct trap existed.
> 
> I don't see any evidence of it applying anywhere for compat PV guests
> either, even those with syscall enabled.

Neither int80 nor the 32-bit mode syscall apply the flags mask. Therefore
why would we mimic such behavior in Xen?

>> With it not
>> really being an (emulated) MSR, but an agreement that guests will only ever
>> care about having DF cleared (besides the requested way of dealing with the
>> event mask). One of the many things not written down anywhere ...
> 
> If it's not written down, it doesn't exist...

IOW the PV interface as a whole largely doesn't exist.

> And even if this is supposed to be a PV-FMASK-ish thing, it's buggy
> because it apples also when #UD is raised for no registered callback. 
> (And yes, I realise there is a chronology issues here (the #UD check is
> the newer element), but it really will corrupt state as presented in a
> SIGSEGV frame.
> 
> The question we need to answer is whether there is any remotely-credible
> way that a PV guest kernel author could be expecting this behaviour and
> relying on it.
> 
> I honestly don't think there is.

I was going to say XenoLinux and its forward ports did, but upstream
Linux does, too. The native and PV paths being largely shared, there's
no CLD in sight anywhere.

> It fails the principle of least surprise compared to native behaviour,
> 32bit PV behaviour, and to every non-SYSCALL based 64bit event also.

Why "every"? switch_to_kernel is used solely for syscall handling.

And why do you consider this behavior surprising, when really it would
be surprising for the flag to remain untouched from what user mode had,
when taking into account how the MSR is set in the native case?

> It either needs writing down somewhere (and the #UD case fixing), or it
> needs deleing, because continuing to leave it in this state isn't ok.

I can try to determine whether addressing the #UD case can be done in a
non-intrusive way. If so, I can split off that part of the change here,
and make it a separate one with the MOVL->MOVB just done on the side.
But really this is scope creep: The change as is has no (intended)
change in behavior. It ought to be fine to go in independently; the
further work you ask for could be done up front or later on.

Jan