[v1] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

[PATCH for-4.14] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

Posted by Andrew Cooper 5 years, 8 months ago

Just like the alternatives infrastructure, the livepatch infrastructure
disables CR0.WP to perform patching, which is not permitted with CET active.

Modify arch_livepatch_{quiesce,revive}() to disable CET before disabling WP,
and reset the dirty bits on all virtual regions before re-enabling CET.  One
complication is that arch_livepatch_revive() has to fix up the top of the
shadow stack.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Pawel Wieczorkiewicz <wipawel@amazon.de>
CC: Paul Durrant <paul@xen.org>

For 4.14.  This is a bug in a 4.14 feature, with a very low risk to non-CET
usecases.

A better longterm plan (but definitely 4.15 material now) would be to create a
separate set of writeable mappings to perform the patching on, at which point
we don't need to disable CET, play with WP, or retroactively clear dirty bits.

Do we ever write into .rodata?  AFAICT, we introduce new fuctions for
references to new .rodata, rather than modifying existing .rodata.  If however
we do modify .rodata, then the virtual regions need extending with information
about .rodata so we can zap those dirty bits as well.
---
 xen/arch/x86/livepatch.c         | 28 ++++++++++++++++++++++++++++
 xen/common/virtual_region.c      | 13 +++++++++++++
 xen/include/xen/virtual_region.h |  1 +
 3 files changed, 42 insertions(+)

diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
index 901fad96bf..10231a4e40 100644
--- a/xen/arch/x86/livepatch.c
+++ b/xen/arch/x86/livepatch.c
@@ -12,6 +12,7 @@
 #include <xen/livepatch.h>
 #include <xen/sched.h>
 #include <xen/vm_event.h>
+#include <xen/virtual_region.h>
 
 #include <asm/fixmap.h>
 #include <asm/nmi.h>
@@ -58,6 +59,10 @@ int arch_livepatch_safety_check(void)
 
 int arch_livepatch_quiesce(void)
 {
+    /* If Shadow Stacks are in use, disable CR4.CET so we can modify CR0.WP. */
+    if ( cpu_has_xen_shstk )
+        write_cr4(read_cr4() & ~X86_CR4_CET);
+
     /* Disable WP to allow changes to read-only pages. */
     write_cr0(read_cr0() & ~X86_CR0_WP);
 
@@ -68,6 +73,29 @@ void arch_livepatch_revive(void)
 {
     /* Reinstate WP. */
     write_cr0(read_cr0() | X86_CR0_WP);
+
+    /* Clobber dirty bits and reinstate CET, if applicable. */
+    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
+    {
+        unsigned long tmp;
+
+        reset_virtual_region_perms();
+
+        write_cr4(read_cr4() | X86_CR4_CET);
+
+        /*
+         * Fix up the return address on the shadow stack, which currently
+         * points at arch_livepatch_quiesce()'s caller.
+         *
+         * Note: this is somewhat fragile, and depends on both
+         * arch_livepatch_{quiesce,revive}() being called from the same
+         * function, which is currently the case.
+         */
+        asm volatile ("rdsspq %[ssp];"
+                      "wrssq %[addr], (%[ssp]);"
+                      : [ssp] "=&r" (tmp)
+                      : [addr] "r" (__builtin_return_address(0)));
+    }
 }
 
 int arch_livepatch_verify_func(const struct livepatch_func *func)
diff --git a/xen/common/virtual_region.c b/xen/common/virtual_region.c
index aa23918bce..84d993d8f8 100644
--- a/xen/common/virtual_region.c
+++ b/xen/common/virtual_region.c
@@ -4,6 +4,7 @@
 
 #include <xen/init.h>
 #include <xen/kernel.h>
+#include <xen/mm.h>
 #include <xen/rcupdate.h>
 #include <xen/spinlock.h>
 #include <xen/virtual_region.h>
@@ -91,6 +92,18 @@ void unregister_virtual_region(struct virtual_region *r)
     remove_virtual_region(r);
 }
 
+void reset_virtual_region_perms(void)
+{
+    const struct virtual_region *region;
+
+    rcu_read_lock(&rcu_virtual_region_lock);
+    list_for_each_entry_rcu( region, &virtual_region_list, list )
+        modify_xen_mappings((unsigned long)region->start,
+                            ROUNDUP((unsigned long)region->end, PAGE_SIZE),
+                            PAGE_HYPERVISOR_RX);
+    rcu_read_unlock(&rcu_virtual_region_lock);
+}
+
 void __init unregister_init_virtual_region(void)
 {
     BUG_ON(system_state != SYS_STATE_active);
diff --git a/xen/include/xen/virtual_region.h b/xen/include/xen/virtual_region.h
index e5e58ed96b..ba408eb87a 100644
--- a/xen/include/xen/virtual_region.h
+++ b/xen/include/xen/virtual_region.h
@@ -33,6 +33,7 @@ void setup_virtual_regions(const struct exception_table_entry *start,
 void unregister_init_virtual_region(void);
 void register_virtual_region(struct virtual_region *r);
 void unregister_virtual_region(struct virtual_region *r);
+void reset_virtual_region_perms(void);
 
 #endif /* __XEN_VIRTUAL_REGION_H__ */
 
-- 
2.11.0

Re: [PATCH for-4.14] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

Posted by Jan Beulich 5 years, 8 months ago

On 08.06.2020 22:02, Andrew Cooper wrote:
> Do we ever write into .rodata?  AFAICT, we introduce new fuctions for
> references to new .rodata, rather than modifying existing .rodata.  If however
> we do modify .rodata, then the virtual regions need extending with information
> about .rodata so we can zap those dirty bits as well.

Inspired by looking at setup_virtual_regions(), do exception fixup
or bug frame tables possibly get patched?

> @@ -58,6 +59,10 @@ int arch_livepatch_safety_check(void)
>  
>  int arch_livepatch_quiesce(void)
>  {
> +    /* If Shadow Stacks are in use, disable CR4.CET so we can modify CR0.WP. */
> +    if ( cpu_has_xen_shstk )
> +        write_cr4(read_cr4() & ~X86_CR4_CET);
> +
>      /* Disable WP to allow changes to read-only pages. */
>      write_cr0(read_cr0() & ~X86_CR0_WP);
>  
> @@ -68,6 +73,29 @@ void arch_livepatch_revive(void)
>  {
>      /* Reinstate WP. */
>      write_cr0(read_cr0() | X86_CR0_WP);
> +
> +    /* Clobber dirty bits and reinstate CET, if applicable. */
> +    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
> +    {
> +        unsigned long tmp;
> +
> +        reset_virtual_region_perms();
> +
> +        write_cr4(read_cr4() | X86_CR4_CET);
> +
> +        /*
> +         * Fix up the return address on the shadow stack, which currently
> +         * points at arch_livepatch_quiesce()'s caller.
> +         *
> +         * Note: this is somewhat fragile, and depends on both
> +         * arch_livepatch_{quiesce,revive}() being called from the same
> +         * function, which is currently the case.
> +         */
> +        asm volatile ("rdsspq %[ssp];"
> +                      "wrssq %[addr], (%[ssp]);"
> +                      : [ssp] "=&r" (tmp)
> +                      : [addr] "r" (__builtin_return_address(0)));
> +    }

To be safe against LTO I think this wants accompanying with making
both functions noinline.

As to the fragility - how about arch_livepatch_quiesce() returning
__builtin_return_address(0) to its caller, for passing into here
for verification? This would also make more noticeable that the
two should be be called from the same function (or else the "token"
would need passing further around).

> @@ -91,6 +92,18 @@ void unregister_virtual_region(struct virtual_region *r)
>      remove_virtual_region(r);
>  }
>  
> +void reset_virtual_region_perms(void)
> +{
> +    const struct virtual_region *region;
> +
> +    rcu_read_lock(&rcu_virtual_region_lock);
> +    list_for_each_entry_rcu( region, &virtual_region_list, list )
> +        modify_xen_mappings((unsigned long)region->start,
> +                            ROUNDUP((unsigned long)region->end, PAGE_SIZE),
> +                            PAGE_HYPERVISOR_RX);
> +    rcu_read_unlock(&rcu_virtual_region_lock);
> +}

Doesn't this result in shattering the trailing (and currently still
only) 2Mb page mapping .text in the xen.efi case? With the
expectation of the approach changing in 4.15 this may not need
addressing, but should imo be mentioned as a shortcoming in the
description then.

Also - how about "restore" instead of "reset"?

Finally, while indeed not a lot of code, should it nevertheless go
inside #ifdef CONFIG_LIVEPATCH?

Jan

Re: [PATCH for-4.14] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

Posted by Andrew Cooper 5 years, 8 months ago

On 09/06/2020 14:41, Jan Beulich wrote:
> On 08.06.2020 22:02, Andrew Cooper wrote:
>> Do we ever write into .rodata?  AFAICT, we introduce new fuctions for
>> references to new .rodata, rather than modifying existing .rodata.  If however
>> we do modify .rodata, then the virtual regions need extending with information
>> about .rodata so we can zap those dirty bits as well.
> Inspired by looking at setup_virtual_regions(), do exception fixup
> or bug frame tables possibly get patched?

If they're not in .rodata, they really ought to be.

That said, neither of those tables should get touched - we add new
subset tables in the livepatch, which covers anything arising from
modified code.  This means we don't merge/resort the table on load, or
edit the table at all on unload.

>
>> @@ -58,6 +59,10 @@ int arch_livepatch_safety_check(void)
>>  
>>  int arch_livepatch_quiesce(void)
>>  {
>> +    /* If Shadow Stacks are in use, disable CR4.CET so we can modify CR0.WP. */
>> +    if ( cpu_has_xen_shstk )
>> +        write_cr4(read_cr4() & ~X86_CR4_CET);
>> +
>>      /* Disable WP to allow changes to read-only pages. */
>>      write_cr0(read_cr0() & ~X86_CR0_WP);
>>  
>> @@ -68,6 +73,29 @@ void arch_livepatch_revive(void)
>>  {
>>      /* Reinstate WP. */
>>      write_cr0(read_cr0() | X86_CR0_WP);
>> +
>> +    /* Clobber dirty bits and reinstate CET, if applicable. */
>> +    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
>> +    {
>> +        unsigned long tmp;
>> +
>> +        reset_virtual_region_perms();
>> +
>> +        write_cr4(read_cr4() | X86_CR4_CET);
>> +
>> +        /*
>> +         * Fix up the return address on the shadow stack, which currently
>> +         * points at arch_livepatch_quiesce()'s caller.
>> +         *
>> +         * Note: this is somewhat fragile, and depends on both
>> +         * arch_livepatch_{quiesce,revive}() being called from the same
>> +         * function, which is currently the case.
>> +         */
>> +        asm volatile ("rdsspq %[ssp];"
>> +                      "wrssq %[addr], (%[ssp]);"
>> +                      : [ssp] "=&r" (tmp)
>> +                      : [addr] "r" (__builtin_return_address(0)));
>> +    }
> To be safe against LTO I think this wants accompanying with making
> both functions noinline.

Hmm true.

> As to the fragility - how about arch_livepatch_quiesce() returning
> __builtin_return_address(0) to its caller, for passing into here
> for verification? This would also make more noticeable that the
> two should be be called from the same function (or else the "token"
> would need passing further around).

This I'm less certain about, as its fairly invasive to common code, just
to bodge around something I don't expect to last very long into the 4.15
timeframe.

>
>> @@ -91,6 +92,18 @@ void unregister_virtual_region(struct virtual_region *r)
>>      remove_virtual_region(r);
>>  }
>>  
>> +void reset_virtual_region_perms(void)
>> +{
>> +    const struct virtual_region *region;
>> +
>> +    rcu_read_lock(&rcu_virtual_region_lock);
>> +    list_for_each_entry_rcu( region, &virtual_region_list, list )
>> +        modify_xen_mappings((unsigned long)region->start,
>> +                            ROUNDUP((unsigned long)region->end, PAGE_SIZE),
>> +                            PAGE_HYPERVISOR_RX);
>> +    rcu_read_unlock(&rcu_virtual_region_lock);
>> +}
> Doesn't this result in shattering the trailing (and currently still
> only) 2Mb page mapping .text in the xen.efi case?

Not any more or less than its already clobbered by this logic in the
alternatives path, I think?

>  With the
> expectation of the approach changing in 4.15 this may not need
> addressing, but should imo be mentioned as a shortcoming in the
> description then.
>
> Also - how about "restore" instead of "reset"?

Why?  We're not passing some state sideways into the new mappings -
we're resetting them to their expected values.

>
> Finally, while indeed not a lot of code, should it nevertheless go
> inside #ifdef CONFIG_LIVEPATCH?

Tbh, it could be CONFIG_XEN_SHSTK if we decide to go down that route.

~Andrew

Re: [PATCH for-4.14] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

Posted by Jan Beulich 5 years, 7 months ago

On 10.06.2020 16:39, Andrew Cooper wrote:
> On 09/06/2020 14:41, Jan Beulich wrote:
>> On 08.06.2020 22:02, Andrew Cooper wrote:
>>> Do we ever write into .rodata?  AFAICT, we introduce new fuctions for
>>> references to new .rodata, rather than modifying existing .rodata.  If however
>>> we do modify .rodata, then the virtual regions need extending with information
>>> about .rodata so we can zap those dirty bits as well.
>> Inspired by looking at setup_virtual_regions(), do exception fixup
>> or bug frame tables possibly get patched?
> 
> If they're not in .rodata, they really ought to be.

They are, afaict, and hence my question.

> That said, neither of those tables should get touched - we add new
> subset tables in the livepatch, which covers anything arising from
> modified code.  This means we don't merge/resort the table on load, or
> edit the table at all on unload.

I guessed it ought to be like that, but thought I'd better ask.

>>> @@ -58,6 +59,10 @@ int arch_livepatch_safety_check(void)
>>>  
>>>  int arch_livepatch_quiesce(void)
>>>  {
>>> +    /* If Shadow Stacks are in use, disable CR4.CET so we can modify CR0.WP. */
>>> +    if ( cpu_has_xen_shstk )
>>> +        write_cr4(read_cr4() & ~X86_CR4_CET);
>>> +
>>>      /* Disable WP to allow changes to read-only pages. */
>>>      write_cr0(read_cr0() & ~X86_CR0_WP);
>>>  
>>> @@ -68,6 +73,29 @@ void arch_livepatch_revive(void)
>>>  {
>>>      /* Reinstate WP. */
>>>      write_cr0(read_cr0() | X86_CR0_WP);
>>> +
>>> +    /* Clobber dirty bits and reinstate CET, if applicable. */
>>> +    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
>>> +    {
>>> +        unsigned long tmp;
>>> +
>>> +        reset_virtual_region_perms();
>>> +
>>> +        write_cr4(read_cr4() | X86_CR4_CET);
>>> +
>>> +        /*
>>> +         * Fix up the return address on the shadow stack, which currently
>>> +         * points at arch_livepatch_quiesce()'s caller.
>>> +         *
>>> +         * Note: this is somewhat fragile, and depends on both
>>> +         * arch_livepatch_{quiesce,revive}() being called from the same
>>> +         * function, which is currently the case.
>>> +         */
>>> +        asm volatile ("rdsspq %[ssp];"
>>> +                      "wrssq %[addr], (%[ssp]);"
>>> +                      : [ssp] "=&r" (tmp)
>>> +                      : [addr] "r" (__builtin_return_address(0)));
>>> +    }
>> To be safe against LTO I think this wants accompanying with making
>> both functions noinline.
> 
> Hmm true.
> 
>> As to the fragility - how about arch_livepatch_quiesce() returning
>> __builtin_return_address(0) to its caller, for passing into here
>> for verification? This would also make more noticeable that the
>> two should be be called from the same function (or else the "token"
>> would need passing further around).
> 
> This I'm less certain about, as its fairly invasive to common code, just
> to bodge around something I don't expect to last very long into the 4.15
> timeframe.

I don't see it  being invasive - there's a new local variable needed
in both of apply_payload() and revert_payload(), and of course the
call sites would need adjustment.

>>> @@ -91,6 +92,18 @@ void unregister_virtual_region(struct virtual_region *r)
>>>      remove_virtual_region(r);
>>>  }
>>>  
>>> +void reset_virtual_region_perms(void)
>>> +{
>>> +    const struct virtual_region *region;
>>> +
>>> +    rcu_read_lock(&rcu_virtual_region_lock);
>>> +    list_for_each_entry_rcu( region, &virtual_region_list, list )
>>> +        modify_xen_mappings((unsigned long)region->start,
>>> +                            ROUNDUP((unsigned long)region->end, PAGE_SIZE),
>>> +                            PAGE_HYPERVISOR_RX);
>>> +    rcu_read_unlock(&rcu_virtual_region_lock);
>>> +}
>> Doesn't this result in shattering the trailing (and currently still
>> only) 2Mb page mapping .text in the xen.efi case?
> 
> Not any more or less than its already clobbered by this logic in the
> alternatives path, I think?

Not afaict, there we have

    if ( cpu_has_xen_shstk )
        modify_xen_mappings(XEN_VIRT_START + MB(2),
                            (unsigned long)&__2M_text_end,
                            PAGE_HYPERVISOR_RX);

>>  With the
>> expectation of the approach changing in 4.15 this may not need
>> addressing, but should imo be mentioned as a shortcoming in the
>> description then.
>>
>> Also - how about "restore" instead of "reset"?
> 
> Why?  We're not passing some state sideways into the new mappings -
> we're resetting them to their expected values.

To me as a non-native speaker "reset" means more like some initial
state, whereas "restore" means more like "to some intended state".

>> Finally, while indeed not a lot of code, should it nevertheless go
>> inside #ifdef CONFIG_LIVEPATCH?
> 
> Tbh, it could be CONFIG_XEN_SHSTK if we decide to go down that route.

The the && of both, really.

Jan

Re: [PATCH for-4.14] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

Posted by Andrew Cooper 5 years, 7 months ago

On 15/06/2020 16:16, Jan Beulich wrote:
>>>> @@ -58,6 +59,10 @@ int arch_livepatch_safety_check(void)
>>>>  
>>>>  int arch_livepatch_quiesce(void)
>>>>  {
>>>> +    /* If Shadow Stacks are in use, disable CR4.CET so we can modify CR0.WP. */
>>>> +    if ( cpu_has_xen_shstk )
>>>> +        write_cr4(read_cr4() & ~X86_CR4_CET);
>>>> +
>>>>      /* Disable WP to allow changes to read-only pages. */
>>>>      write_cr0(read_cr0() & ~X86_CR0_WP);
>>>>  
>>>> @@ -68,6 +73,29 @@ void arch_livepatch_revive(void)
>>>>  {
>>>>      /* Reinstate WP. */
>>>>      write_cr0(read_cr0() | X86_CR0_WP);
>>>> +
>>>> +    /* Clobber dirty bits and reinstate CET, if applicable. */
>>>> +    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
>>>> +    {
>>>> +        unsigned long tmp;
>>>> +
>>>> +        reset_virtual_region_perms();
>>>> +
>>>> +        write_cr4(read_cr4() | X86_CR4_CET);
>>>> +
>>>> +        /*
>>>> +         * Fix up the return address on the shadow stack, which currently
>>>> +         * points at arch_livepatch_quiesce()'s caller.
>>>> +         *
>>>> +         * Note: this is somewhat fragile, and depends on both
>>>> +         * arch_livepatch_{quiesce,revive}() being called from the same
>>>> +         * function, which is currently the case.
>>>> +         */
>>>> +        asm volatile ("rdsspq %[ssp];"
>>>> +                      "wrssq %[addr], (%[ssp]);"
>>>> +                      : [ssp] "=&r" (tmp)
>>>> +                      : [addr] "r" (__builtin_return_address(0)));
>>>> +    }
>>> To be safe against LTO I think this wants accompanying with making
>>> both functions noinline.
>> Hmm true.
>>
>>> As to the fragility - how about arch_livepatch_quiesce() returning
>>> __builtin_return_address(0) to its caller, for passing into here
>>> for verification? This would also make more noticeable that the
>>> two should be be called from the same function (or else the "token"
>>> would need passing further around).
>> This I'm less certain about, as its fairly invasive to common code, just
>> to bodge around something I don't expect to last very long into the 4.15
>> timeframe.
> I don't see it  being invasive - there's a new local variable needed
> in both of apply_payload() and revert_payload(), and of course the
> call sites would need adjustment.

Exactly - the call site adjustment is what makes it invasive in common
code, and all other architectures.

Any getting this wrong will die with #CP[near ret] anyway.

The only thing passing that value around would do is let you tweak the
error message slightly before we crash out.

>>>> @@ -91,6 +92,18 @@ void unregister_virtual_region(struct virtual_region *r)
>>>>      remove_virtual_region(r);
>>>>  }
>>>>  
>>>> +void reset_virtual_region_perms(void)
>>>> +{
>>>> +    const struct virtual_region *region;
>>>> +
>>>> +    rcu_read_lock(&rcu_virtual_region_lock);
>>>> +    list_for_each_entry_rcu( region, &virtual_region_list, list )
>>>> +        modify_xen_mappings((unsigned long)region->start,
>>>> +                            ROUNDUP((unsigned long)region->end, PAGE_SIZE),
>>>> +                            PAGE_HYPERVISOR_RX);
>>>> +    rcu_read_unlock(&rcu_virtual_region_lock);
>>>> +}
>>> Doesn't this result in shattering the trailing (and currently still
>>> only) 2Mb page mapping .text in the xen.efi case?
>> Not any more or less than its already clobbered by this logic in the
>> alternatives path, I think?
> Not afaict, there we have
>
>     if ( cpu_has_xen_shstk )
>         modify_xen_mappings(XEN_VIRT_START + MB(2),
>                             (unsigned long)&__2M_text_end,
>                             PAGE_HYPERVISOR_RX);

Hmm ok.  I'll make a note.

>>>  With the
>>> expectation of the approach changing in 4.15 this may not need
>>> addressing, but should imo be mentioned as a shortcoming in the
>>> description then.
>>>
>>> Also - how about "restore" instead of "reset"?
>> Why?  We're not passing some state sideways into the new mappings -
>> we're resetting them to their expected values.
> To me as a non-native speaker "reset" means more like some initial
> state, whereas "restore" means more like "to some intended state".

I feel that is a very subjective interpretation, but even if we go with
it, the fact the function is void distinguishes the two.

~Andrew

Re: [PATCH for-4.14] x86/livepatch: Make livepatching compatible with CET Shadow Stacks

Posted by Jan Beulich 5 years, 7 months ago

On 26.06.2020 13:56, Andrew Cooper wrote:
> On 15/06/2020 16:16, Jan Beulich wrote:
>>>>> @@ -58,6 +59,10 @@ int arch_livepatch_safety_check(void)
>>>>>  
>>>>>  int arch_livepatch_quiesce(void)
>>>>>  {
>>>>> +    /* If Shadow Stacks are in use, disable CR4.CET so we can modify CR0.WP. */
>>>>> +    if ( cpu_has_xen_shstk )
>>>>> +        write_cr4(read_cr4() & ~X86_CR4_CET);
>>>>> +
>>>>>      /* Disable WP to allow changes to read-only pages. */
>>>>>      write_cr0(read_cr0() & ~X86_CR0_WP);
>>>>>  
>>>>> @@ -68,6 +73,29 @@ void arch_livepatch_revive(void)
>>>>>  {
>>>>>      /* Reinstate WP. */
>>>>>      write_cr0(read_cr0() | X86_CR0_WP);
>>>>> +
>>>>> +    /* Clobber dirty bits and reinstate CET, if applicable. */
>>>>> +    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
>>>>> +    {
>>>>> +        unsigned long tmp;
>>>>> +
>>>>> +        reset_virtual_region_perms();
>>>>> +
>>>>> +        write_cr4(read_cr4() | X86_CR4_CET);
>>>>> +
>>>>> +        /*
>>>>> +         * Fix up the return address on the shadow stack, which currently
>>>>> +         * points at arch_livepatch_quiesce()'s caller.
>>>>> +         *
>>>>> +         * Note: this is somewhat fragile, and depends on both
>>>>> +         * arch_livepatch_{quiesce,revive}() being called from the same
>>>>> +         * function, which is currently the case.
>>>>> +         */
>>>>> +        asm volatile ("rdsspq %[ssp];"
>>>>> +                      "wrssq %[addr], (%[ssp]);"
>>>>> +                      : [ssp] "=&r" (tmp)
>>>>> +                      : [addr] "r" (__builtin_return_address(0)));
>>>>> +    }
>>>> To be safe against LTO I think this wants accompanying with making
>>>> both functions noinline.
>>> Hmm true.
>>>
>>>> As to the fragility - how about arch_livepatch_quiesce() returning
>>>> __builtin_return_address(0) to its caller, for passing into here
>>>> for verification? This would also make more noticeable that the
>>>> two should be be called from the same function (or else the "token"
>>>> would need passing further around).
>>> This I'm less certain about, as its fairly invasive to common code, just
>>> to bodge around something I don't expect to last very long into the 4.15
>>> timeframe.
>> I don't see it  being invasive - there's a new local variable needed
>> in both of apply_payload() and revert_payload(), and of course the
>> call sites would need adjustment.
> 
> Exactly - the call site adjustment is what makes it invasive in common
> code, and all other architectures.
> 
> Any getting this wrong will die with #CP[near ret] anyway.
> 
> The only thing passing that value around would do is let you tweak the
> error message slightly before we crash out.

Well, okay - I'm not a maintainer of that part of the code anyway.

Jan