[PATCH v2 12/14] x86/entry: Adjust guest paths to be shadow stack compatible

Andrew Cooper posted 14 patches 5 years, 8 months ago
[PATCH v2 12/14] x86/entry: Adjust guest paths to be shadow stack compatible
Posted by Andrew Cooper 5 years, 8 months ago
The SYSCALL/SYSENTER/SYSRET paths need to use {SET,CLR}SSBSY.  The IRET to
guest paths must not.  In the SYSRET path, re-position the mov which loads rip
into %rcx so we can use %rcx for CLRSSBSY, rather than spilling another
register to the stack.

While we can in principle detect shadow stack corruption and a failure to
clear the supervisor token busy bit in the SYSRET path (by inspecting the
carry flag following CLRSSBSY), we cannot detect similar problems for the IRET
path (IRET is specified not to fault in this case).

We will double fault at some point later, when next trying to enter Xen, due
to an already-set supervisor shadow stack busy bit.  As SYSRET is a uncommon
path anyway, avoid the added complexity for no appreciable gain.

The IST switch onto the primary stack is not great as we have an instruction
boundary with no shadow stack.  This is the least bad option available.

These paths are not used before shadow stacks are properly established, so can
use alternatives to avoid extra runtime CET detection logic.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>

v2:
 * Break comment deletion out to an earlier patch
 * SETSSBSY on the SYSENTER path as well
 * Don't spill %rax to the stack in the SYSRET path
---
 xen/arch/x86/x86_64/compat/entry.S |  1 +
 xen/arch/x86/x86_64/entry.S        | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 3cd375bd48..2ca81341a4 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -198,6 +198,7 @@ ENTRY(cr4_pv32_restore)
 
 /* See lstar_enter for entry register state. */
 ENTRY(cstar_enter)
+        ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK
         /* sti could live here when we don't switch page tables below. */
         CR4_PV32_RESTORE
         movq  8(%rsp),%rax /* Restore %rax. */
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 78ac0df49f..449ee468e4 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -191,9 +191,16 @@ restore_all_guest:
         sarq  $47,%rcx
         incl  %ecx
         cmpl  $1,%ecx
-        movq  8(%rsp),%rcx            # RIP
         ja    iret_exit_to_guest
 
+        /* Clear the supervisor shadow stack token busy bit. */
+.macro rag_clrssbsy
+        rdsspq %rcx
+        clrssbsy (%rcx)
+.endm
+        ALTERNATIVE "", rag_clrssbsy, X86_FEATURE_XEN_SHSTK
+
+        movq  8(%rsp), %rcx           # RIP
         cmpw  $FLAT_USER_CS32,16(%rsp)# CS
         movq  32(%rsp),%rsp           # RSP
         je    1f
@@ -226,6 +233,7 @@ iret_exit_to_guest:
  * %ss must be saved into the space left by the trampoline.
  */
 ENTRY(lstar_enter)
+        ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK
         /* sti could live here when we don't switch page tables below. */
         movq  8(%rsp),%rax /* Restore %rax. */
         movq  $FLAT_KERNEL_SS,8(%rsp)
@@ -259,6 +267,7 @@ ENTRY(lstar_enter)
         jmp   test_all_events
 
 ENTRY(sysenter_entry)
+        ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK
         /* sti could live here when we don't switch page tables below. */
         pushq $FLAT_USER_SS
         pushq $0
@@ -877,6 +886,27 @@ handle_ist_exception:
         movl  $UREGS_kernel_sizeof/8,%ecx
         movq  %rdi,%rsp
         rep   movsq
+
+        /* Switch Shadow Stacks */
+.macro ist_switch_shstk
+        rdsspq %rdi
+        clrssbsy (%rdi)
+        /*
+         * Switching supervisor shadow stacks is specially hard, as supervisor
+         * and restore tokens are incompatible.
+         *
+         * For now, we only need to switch on to an unused primary shadow
+         * stack, so use SETSSBSY for the purpose, exactly like the
+         * SYSCALL/SYSENTER entry.
+         *
+         * Ideally, we'd want to CLRSSBSY after switching stacks, but that
+         * will leave SSP zeroed so it not an option.  Instead, we transiently
+         * have a zero SSP on this instruction boundary, and depend on IST for
+         * NMI/#MC protection.
+         */
+        setssbsy
+.endm
+        ALTERNATIVE "", ist_switch_shstk, X86_FEATURE_XEN_SHSTK
 1:
 #else
         ASSERT_CONTEXT_IS_XEN
-- 
2.11.0


Re: [PATCH v2 12/14] x86/entry: Adjust guest paths to be shadow stack compatible
Posted by Jan Beulich 5 years, 8 months ago
On 27.05.2020 21:18, Andrew Cooper wrote:
> The SYSCALL/SYSENTER/SYSRET paths need to use {SET,CLR}SSBSY.  The IRET to
> guest paths must not.  In the SYSRET path, re-position the mov which loads rip
> into %rcx so we can use %rcx for CLRSSBSY, rather than spilling another
> register to the stack.
> 
> While we can in principle detect shadow stack corruption and a failure to
> clear the supervisor token busy bit in the SYSRET path (by inspecting the
> carry flag following CLRSSBSY), we cannot detect similar problems for the IRET
> path (IRET is specified not to fault in this case).
> 
> We will double fault at some point later, when next trying to enter Xen, due
> to an already-set supervisor shadow stack busy bit.  As SYSRET is a uncommon
> path anyway, avoid the added complexity for no appreciable gain.

I'm okay with the avoidance of complexity, but why is the SYSRET path
uncommon? Almost all hypercall returns ought to take that path?

> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -191,9 +191,16 @@ restore_all_guest:
>          sarq  $47,%rcx
>          incl  %ecx
>          cmpl  $1,%ecx
> -        movq  8(%rsp),%rcx            # RIP
>          ja    iret_exit_to_guest

This removal from the shared part of the exit path needs to be
reflected on both of the sides of the JA, i.e. ...

>  
> +        /* Clear the supervisor shadow stack token busy bit. */
> +.macro rag_clrssbsy
> +        rdsspq %rcx
> +        clrssbsy (%rcx)
> +.endm
> +        ALTERNATIVE "", rag_clrssbsy, X86_FEATURE_XEN_SHSTK
> +
> +        movq  8(%rsp), %rcx           # RIP

... not just here, but also like this (with the JA above changed
to target the new label):

         ALIGN
 /* No special register assumptions. */
+.Liret_exit_to_guest:
+        movq  8(%rsp),%rcx            # RIP
 iret_exit_to_guest:
         andl  $~(X86_EFLAGS_IOPL|X86_EFLAGS_NT|X86_EFLAGS_VM),24(%rsp)
         orl   $X86_EFLAGS_IF,24(%rsp)

Granted it's mostly cosmetic, as the IRETQ ought to fault, but
it's still a use of IRET in place of SYSRET, and hence we better
get guest register state right. With this or a functionally
identical adjustment (or a clarification on what makes you
convinced this adjustment isn't needed)
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan

Re: [PATCH v2 12/14] x86/entry: Adjust guest paths to be shadow stack compatible
Posted by Andrew Cooper 5 years, 8 months ago
On 29/05/2020 13:40, Jan Beulich wrote:
> On 27.05.2020 21:18, Andrew Cooper wrote:
>> The SYSCALL/SYSENTER/SYSRET paths need to use {SET,CLR}SSBSY.  The IRET to
>> guest paths must not.  In the SYSRET path, re-position the mov which loads rip
>> into %rcx so we can use %rcx for CLRSSBSY, rather than spilling another
>> register to the stack.
>>
>> While we can in principle detect shadow stack corruption and a failure to
>> clear the supervisor token busy bit in the SYSRET path (by inspecting the
>> carry flag following CLRSSBSY), we cannot detect similar problems for the IRET
>> path (IRET is specified not to fault in this case).
>>
>> We will double fault at some point later, when next trying to enter Xen, due
>> to an already-set supervisor shadow stack busy bit.  As SYSRET is a uncommon
>> path anyway, avoid the added complexity for no appreciable gain.
> I'm okay with the avoidance of complexity, but why is the SYSRET path
> uncommon? Almost all hypercall returns ought to take that path?

But hypercalls returns aren't the majority of returns to guest context.

IRET from Xen IPIs, or from event channel injections hitting guest
userspace, are the most common in a non-idle system.

>
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -191,9 +191,16 @@ restore_all_guest:
>>          sarq  $47,%rcx
>>          incl  %ecx
>>          cmpl  $1,%ecx
>> -        movq  8(%rsp),%rcx            # RIP
>>          ja    iret_exit_to_guest
> This removal from the shared part of the exit path needs to be
> reflected on both of the sides of the JA, i.e. ...
>
>>  
>> +        /* Clear the supervisor shadow stack token busy bit. */
>> +.macro rag_clrssbsy
>> +        rdsspq %rcx
>> +        clrssbsy (%rcx)
>> +.endm
>> +        ALTERNATIVE "", rag_clrssbsy, X86_FEATURE_XEN_SHSTK
>> +
>> +        movq  8(%rsp), %rcx           # RIP
> ... not just here, but also like this (with the JA above changed
> to target the new label):
>
>          ALIGN
>  /* No special register assumptions. */
> +.Liret_exit_to_guest:
> +        movq  8(%rsp),%rcx            # RIP
>  iret_exit_to_guest:
>          andl  $~(X86_EFLAGS_IOPL|X86_EFLAGS_NT|X86_EFLAGS_VM),24(%rsp)
>          orl   $X86_EFLAGS_IF,24(%rsp)
>
> Granted it's mostly cosmetic, as the IRETQ ought to fault, but
> it's still a use of IRET in place of SYSRET, and hence we better
> get guest register state right. With this or a functionally
> identical adjustment (or a clarification on what makes you
> convinced this adjustment isn't needed)
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Ah yes.  I really ought to retroactively create an XSA-7 PoC for this.

Will fix.

~Andrew