[PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled

Bill Wendling posted 2 patches 3 years, 7 months ago
[PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Bill Wendling 3 years, 7 months ago
The ZERO_CALL_USED_REGS feature may zero out caller-saved registers
before returning.

In spurious_kernel_fault(), the "pte_offset_kernel()" call results in
this assembly code:

.Ltmp151:
        #APP
        # ALT: oldnstr
.Ltmp152:
.Ltmp153:
.Ltmp154:
        .section        .discard.retpoline_safe,"",@progbits
        .quad   .Ltmp154
        .text

        callq   *pv_ops+536(%rip)

.Ltmp155:
        .section        .parainstructions,"a",@progbits
        .p2align        3, 0x0
        .quad   .Ltmp153
        .byte   67
        .byte   .Ltmp155-.Ltmp153
        .short  1
        .text
.Ltmp156:
        # ALT: padding
        .zero   (-(((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152))>0))*((.Ltmp157-.Ltmp158)-(.Ltmp156-.Ltmp152)),144
.Ltmp159:
        .section        .altinstructions,"a",@progbits
.Ltmp160:
        .long   .Ltmp152-.Ltmp160
.Ltmp161:
        .long   .Ltmp158-.Ltmp161
        .short  33040
        .byte   .Ltmp159-.Ltmp152
        .byte   .Ltmp157-.Ltmp158
        .text

        .section        .altinstr_replacement,"ax",@progbits
        # ALT: replacement 1
.Ltmp158:
        movq    %rdi, %rax
.Ltmp157:
        .text
        #NO_APP
.Ltmp162:
        testb   $-128, %dil

The "testb" here is using %dil, but the %rdi register was cleared before
returning from "callq *pv_ops+536(%rip)". Adding the proper constraints
results in the use of a different register:

        movq    %r11, %rdi

        # Similar to above.

        testb   $-128, %r11b

Link: https://github.com/KSPP/linux/issues/192
Signed-off-by: Bill Wendling <morbo@google.com>
Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org>
---
 arch/x86/include/asm/paravirt_types.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index f04157456a49..b1ab5d94881b 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
 				"=c" (__ecx)
 #define PVOP_CALL_CLOBBERS	PVOP_VCALL_CLOBBERS, "=a" (__eax)
 
-/* void functions are still allowed [re]ax for scratch */
+/*
+ * void functions are still allowed [re]ax for scratch.
+ *
+ * The ZERO_CALL_USED REGS feature may end up zeroing out callee-saved
+ * registers. Make sure we model this with the appropriate clobbers.
+ */
+#ifdef CONFIG_ZERO_CALL_USED_REGS
+#define PVOP_VCALLEE_CLOBBERS	"=a" (__eax), PVOP_VCALL_CLOBBERS
+#else
 #define PVOP_VCALLEE_CLOBBERS	"=a" (__eax)
+#endif
 #define PVOP_CALLEE_CLOBBERS	PVOP_VCALLEE_CLOBBERS
 
 #define EXTRA_CLOBBERS	 , "r8", "r9", "r10", "r11"
-- 
2.37.2.789.g6183377224-goog
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Kees Cook 3 years, 7 months ago
On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> [...]
>         callq   *pv_ops+536(%rip)

Do you know which pv_ops function is this? I can't figure out where
pte_offset_kernel() gets converted into a pv_ops call....

> [...]
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
>  				"=c" (__ecx)
>  #define PVOP_CALL_CLOBBERS	PVOP_VCALL_CLOBBERS, "=a" (__eax)
>  
> -/* void functions are still allowed [re]ax for scratch */
> +/*
> + * void functions are still allowed [re]ax for scratch.
> + *
> + * The ZERO_CALL_USED REGS feature may end up zeroing out callee-saved
> + * registers. Make sure we model this with the appropriate clobbers.
> + */
> +#ifdef CONFIG_ZERO_CALL_USED_REGS
> +#define PVOP_VCALLEE_CLOBBERS	"=a" (__eax), PVOP_VCALL_CLOBBERS
> +#else
>  #define PVOP_VCALLEE_CLOBBERS	"=a" (__eax)
> +#endif
>  #define PVOP_CALLEE_CLOBBERS	PVOP_VCALLEE_CLOBBERS

I don't think this should depend on CONFIG_ZERO_CALL_USED_REGS; it should
always be present.

I've only been looking at this just now, so many I'm missing
something. The callee clobbers are for functions with return values,
yes?

For example, 32-bit has to manually deal with doing a 64-bit value return,
and even got it wrong originally, fixing it in commit 0eb592dbba40
("x86/paravirt: return full 64-bit result"), with:

-#define PVOP_VCALLEE_CLOBBERS          "=a" (__eax)
+#define PVOP_VCALLEE_CLOBBERS          "=a" (__eax), "=d" (__edx)

But the naming is confusing, since these aren't actually clobbers,
they're input constraints marked as clobbers (the "=" modifier).

Regardless, the note in the comments ...

 ...
 * However, x86_64 also have to clobber all caller saved registers, which
 * unfortunately, are quite a bit (r8 - r11)
 ...

... would indicate that ALL the function argument registers need to be
marked as clobbers (i.e. the compiler can't figure this out on its own).

I was going to say it seems like they're missing from EXTRA_CLOBBERS,
but it's not used with any of the macros using PVOP_VCALLEE_CLOBBERS,
and then I saw the weird alternatives patching that encodes the clobbers
a second time (CLBR_ANY vs CLBR_RET_REG) via:

#define _paravirt_alt(insn_string, type, clobber)       \
        "771:\n\t" insn_string "\n" "772:\n"            \
        ".pushsection .parainstructions,\"a\"\n"        \
        _ASM_ALIGN "\n"                                 \
        _ASM_PTR " 771b\n"                              \
        "  .byte " type "\n"                            \
        "  .byte 772b-771b\n"                           \
        "  .short " clobber "\n"                        \
        ".popsection\n"

And after reading the alternatives patching code which parses this via
the following struct:

/* These all sit in the .parainstructions section to tell us what to patch. */
struct paravirt_patch_site {
        u8 *instr;              /* original instructions */
        u8 type;                /* type of this instruction */
        u8 len;                 /* length of original instruction */
};

... I see it _doesn't use the clobbers_ at all! *head explode* I found
that removal in commit 27876f3882fd ("x86/paravirt: Remove clobbers from
struct paravirt_patch_site")

So, I guess the CLBR_* can all be entirely removed. But back to my other
train of thought...

It seems like all the input registers need to be explicitly listed in
the PVOP_VCALLEE_CLOBBERS list (as you have), but likely should be done
unconditionally and for 32-bit as well.

-Kees

(Also, please CC linux-hardening@vger.kernel.org.)

-- 
Kees Cook
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Bill Wendling 3 years, 7 months ago
On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > [...]
> >         callq   *pv_ops+536(%rip)
>
> Do you know which pv_ops function is this? I can't figure out where
> pte_offset_kernel() gets converted into a pv_ops call....
>
This one is _paravirt_ident_64, I believe. I think that the original
issue Nathan was seeing was with another seemingly innocuous function.

> > [...]
> > --- a/arch/x86/include/asm/paravirt_types.h
> > +++ b/arch/x86/include/asm/paravirt_types.h
> > @@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
> >                               "=c" (__ecx)
> >  #define PVOP_CALL_CLOBBERS   PVOP_VCALL_CLOBBERS, "=a" (__eax)
> >
> > -/* void functions are still allowed [re]ax for scratch */
> > +/*
> > + * void functions are still allowed [re]ax for scratch.
> > + *
> > + * The ZERO_CALL_USED REGS feature may end up zeroing out callee-saved
> > + * registers. Make sure we model this with the appropriate clobbers.
> > + */
> > +#ifdef CONFIG_ZERO_CALL_USED_REGS
> > +#define PVOP_VCALLEE_CLOBBERS        "=a" (__eax), PVOP_VCALL_CLOBBERS
> > +#else
> >  #define PVOP_VCALLEE_CLOBBERS        "=a" (__eax)
> > +#endif
> >  #define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
>
> I don't think this should depend on CONFIG_ZERO_CALL_USED_REGS; it should
> always be present.
>
> I've only been looking at this just now, so many I'm missing
> something. The callee clobbers are for functions with return values,
> yes?
>
Kinda. It seems that the usage here is to let the compiler know that a
register may be modified by the callee, not just that it's an "actual"
return value. So it's suitable for void functions.

> For example, 32-bit has to manually deal with doing a 64-bit value return,
> and even got it wrong originally, fixing it in commit 0eb592dbba40
> ("x86/paravirt: return full 64-bit result"), with:
>
> -#define PVOP_VCALLEE_CLOBBERS          "=a" (__eax)
> +#define PVOP_VCALLEE_CLOBBERS          "=a" (__eax), "=d" (__edx)
>
> But the naming is confusing, since these aren't actually clobbers,
> they're input constraints marked as clobbers (the "=" modifier).
>
Right.

> Regardless, the note in the comments ...
>
>  ...
>  * However, x86_64 also have to clobber all caller saved registers, which
>  * unfortunately, are quite a bit (r8 - r11)
>  ...
>
> ... would indicate that ALL the function argument registers need to be
> marked as clobbers (i.e. the compiler can't figure this out on its own).
>
Good point. And there are some forms of these macros that specify
those as clobbers.

> I was going to say it seems like they're missing from EXTRA_CLOBBERS,
> but it's not used with any of the macros using PVOP_VCALLEE_CLOBBERS,
> and then I saw the weird alternatives patching that encodes the clobbers
> a second time (CLBR_ANY vs CLBR_RET_REG) via:
>
> #define _paravirt_alt(insn_string, type, clobber)       \
>         "771:\n\t" insn_string "\n" "772:\n"            \
>         ".pushsection .parainstructions,\"a\"\n"        \
>         _ASM_ALIGN "\n"                                 \
>         _ASM_PTR " 771b\n"                              \
>         "  .byte " type "\n"                            \
>         "  .byte 772b-771b\n"                           \
>         "  .short " clobber "\n"                        \
>         ".popsection\n"
>
> And after reading the alternatives patching code which parses this via
> the following struct:
>
> /* These all sit in the .parainstructions section to tell us what to patch. */
> struct paravirt_patch_site {
>         u8 *instr;              /* original instructions */
>         u8 type;                /* type of this instruction */
>         u8 len;                 /* length of original instruction */
> };
>
> ... I see it _doesn't use the clobbers_ at all! *head explode* I found
> that removal in commit 27876f3882fd ("x86/paravirt: Remove clobbers from
> struct paravirt_patch_site")
>
> So, I guess the CLBR_* can all be entirely removed. But back to my other
> train of thought...
>
[switches stations]

> It seems like all the input registers need to be explicitly listed in
> the PVOP_VCALLEE_CLOBBERS list (as you have), but likely should be done
> unconditionally and for 32-bit as well.
>
Possibly, though it may cause significant code degradation when the
compiler needs to store a value that's live over the ASM statement,
but the register it's in isn't actually modified. I saw that in the
example I gave in the description. In the case where a "movq" is used,
there's a useless move of "rdi" into "r11".

> (Also, please CC linux-hardening@vger.kernel.org.)
>
Doh! Someday I'll learn email.

-bw
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Nick Desaulniers 3 years, 7 months ago
On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <morbo@google.com> wrote:
>
> On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > [...]
> > >         callq   *pv_ops+536(%rip)
> >
> > Do you know which pv_ops function is this? I can't figure out where
> > pte_offset_kernel() gets converted into a pv_ops call....
> >
> This one is _paravirt_ident_64, I believe. I think that the original
> issue Nathan was seeing was with another seemingly innocuous function.

_paravirt_ident_64 is marked noinstr, which makes me suspect that it
really needs to not be touched at all by the compiler for
these...special features.

Maybe the definition of noinstr in include/linux/compiler_types.h
should be adding __attribute__((zero_call_used_regs(skip)))?

Untested:

```
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 4f2a819fd60a..a51ab77e2da8 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -226,10 +226,17 @@ struct ftrace_likely_data {
 #define __no_sanitize_or_inline __always_inline
 #endif

+#ifdef CONFIG_ZERO_CALL_USED_REGS
+#define __no_zero_call_used_regs __attribute__((__zero_call_used_reg__(skip)))
+#else
+#define __no_zero_call_used_regs
+#endif
+
 /* Section for code which can't be instrumented at all */
 #define noinstr
         \
        noinline notrace __attribute((__section__(".noinstr.text")))    \
-       __no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
+       __no_kcsan __no_sanitize_address __no_profile                   \
+       __no_sanitize_coverage __no_zero_call_used_regs

 #endif /* __KERNEL__ */
```
Or use __has_attribute in include/linux/compiler_attributes.h.
-- 
Thanks,
~Nick Desaulniers
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Peter Zijlstra 3 years, 7 months ago
On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <morbo@google.com> wrote:
> >
> > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > [...]
> > > >         callq   *pv_ops+536(%rip)
> > >
> > > Do you know which pv_ops function is this? I can't figure out where
> > > pte_offset_kernel() gets converted into a pv_ops call....
> > >
> > This one is _paravirt_ident_64, I believe. I think that the original
> > issue Nathan was seeing was with another seemingly innocuous function.
> 
> _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> really needs to not be touched at all by the compiler for
> these...special features.

My source tree sayeth:

  u64 notrace _paravirt_ident_64(u64 x)

And that function is only ever called at boot, after alternatives runs
it's patched with:

  mov %_ASM_ARG1, %_ASM_AX

Anyway, if you want to take it away from the compiler, something like
so should do.


diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 7ca2d46c08cc..8922e2887779 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -80,11 +80,16 @@ static unsigned paravirt_patch_call(void *insn_buff, const void *target,
 }
 
 #ifdef CONFIG_PARAVIRT_XXL
-/* identity function, which can be inlined */
-u64 notrace _paravirt_ident_64(u64 x)
-{
-	return x;
-}
+extern u64 _paravirt_ident_64(u64 x);
+asm (".pushsection .entry.text, \"ax\"\n"
+     ".global _paravirt_ident_64\n"
+     "_paravirt_ident_64:\n\t"
+     ASM_ENDBR
+     "mov %" _ASM_ARG1 ", %" _ASM_AX "\n\t"
+     ASM_RET
+     ".size _paravirt_ident_64, . - _paravirt_ident_64\n\t"
+     ".type _paravirt_ident_64, @function\n\t"
+     ".popsection");
 #endif
 
 DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Kees Cook 3 years, 7 months ago
On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <morbo@google.com> wrote:
> > >
> > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
> > > >
> > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > [...]
> > > > >         callq   *pv_ops+536(%rip)
> > > >
> > > > Do you know which pv_ops function is this? I can't figure out where
> > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > >
> > > This one is _paravirt_ident_64, I believe. I think that the original
> > > issue Nathan was seeing was with another seemingly innocuous function.
> > 
> > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > really needs to not be touched at all by the compiler for
> > these...special features.
> 
> My source tree sayeth:
> 
>   u64 notrace _paravirt_ident_64(u64 x)

I don't see noinstr either. But it seems a reasonable thing to do?

Bill, does fixing up the noinstr macro and adding it here fix the
problem?

-- 
Kees Cook
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Bill Wendling 3 years, 6 months ago
On Thu, Sep 8, 2022 at 12:10 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <morbo@google.com> wrote:
> > > >
> > > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
> > > > >
> > > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > > [...]
> > > > > >         callq   *pv_ops+536(%rip)
> > > > >
> > > > > Do you know which pv_ops function is this? I can't figure out where
> > > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > > >
> > > > This one is _paravirt_ident_64, I believe. I think that the original
> > > > issue Nathan was seeing was with another seemingly innocuous function.
> > >
> > > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > > really needs to not be touched at all by the compiler for
> > > these...special features.
> >
> > My source tree sayeth:
> >
> >   u64 notrace _paravirt_ident_64(u64 x)
>
> I don't see noinstr either. But it seems a reasonable thing to do?
>
> Bill, does fixing up the noinstr macro and adding it here fix the
> problem?
>
[sorry for late response]

Let me give it a shot. I'll also test out Peter's suggestion, which
might be a better option in the long run. I suspect that we'll need to
devise similar patches for other places, but it shouldn't be hard to
find them all.

-bw

-bw
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Nathan Chancellor 3 years, 6 months ago
On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <morbo@google.com> wrote:
> > >
> > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
> > > >
> > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > [...]
> > > > >         callq   *pv_ops+536(%rip)
> > > >
> > > > Do you know which pv_ops function is this? I can't figure out where
> > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > >
> > > This one is _paravirt_ident_64, I believe. I think that the original
> > > issue Nathan was seeing was with another seemingly innocuous function.
> > 
> > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > really needs to not be touched at all by the compiler for
> > these...special features.
> 
> My source tree sayeth:
> 
>   u64 notrace _paravirt_ident_64(u64 x)
> 
> And that function is only ever called at boot, after alternatives runs
> it's patched with:
> 
>   mov %_ASM_ARG1, %_ASM_AX
> 
> Anyway, if you want to take it away from the compiler, something like
> so should do.

This appears to work fine for me in QEMU, as I can still boot with
CONFIG_ZERO_CALL_USED_REGS and spawn a nested guest without any issues.

> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 7ca2d46c08cc..8922e2887779 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -80,11 +80,16 @@ static unsigned paravirt_patch_call(void *insn_buff, const void *target,
>  }
>  
>  #ifdef CONFIG_PARAVIRT_XXL
> -/* identity function, which can be inlined */
> -u64 notrace _paravirt_ident_64(u64 x)
> -{
> -	return x;
> -}
> +extern u64 _paravirt_ident_64(u64 x);
> +asm (".pushsection .entry.text, \"ax\"\n"
> +     ".global _paravirt_ident_64\n"
> +     "_paravirt_ident_64:\n\t"
> +     ASM_ENDBR
> +     "mov %" _ASM_ARG1 ", %" _ASM_AX "\n\t"
> +     ASM_RET
> +     ".size _paravirt_ident_64, . - _paravirt_ident_64\n\t"
> +     ".type _paravirt_ident_64, @function\n\t"
> +     ".popsection");
>  #endif
>  
>  DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
>
Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
Posted by Bill Wendling 3 years, 6 months ago
On Wed, Sep 14, 2022 at 3:41 PM Nathan Chancellor <nathan@kernel.org> wrote:
> On Wed, Sep 07, 2022 at 10:50:03AM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 11:00:07PM -0700, Nick Desaulniers wrote:
> > > On Sun, Sep 4, 2022 at 11:02 PM Bill Wendling <morbo@google.com> wrote:
> > > >
> > > > On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@chromium.org> wrote:
> > > > >
> > > > > On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > > > > > [...]
> > > > > >         callq   *pv_ops+536(%rip)
> > > > >
> > > > > Do you know which pv_ops function is this? I can't figure out where
> > > > > pte_offset_kernel() gets converted into a pv_ops call....
> > > > >
> > > > This one is _paravirt_ident_64, I believe. I think that the original
> > > > issue Nathan was seeing was with another seemingly innocuous function.
> > >
> > > _paravirt_ident_64 is marked noinstr, which makes me suspect that it
> > > really needs to not be touched at all by the compiler for
> > > these...special features.
> >
> > My source tree sayeth:
> >
> >   u64 notrace _paravirt_ident_64(u64 x)
> >
> > And that function is only ever called at boot, after alternatives runs
> > it's patched with:
> >
> >   mov %_ASM_ARG1, %_ASM_AX
> >
> > Anyway, if you want to take it away from the compiler, something like
> > so should do.
>
> This appears to work fine for me in QEMU, as I can still boot with
> CONFIG_ZERO_CALL_USED_REGS and spawn a nested guest without any issues.
>
Thanks, Nathan. I much prefer to use this patch then and file a
separate issue to investigate the clobbers issue for later.

-bw

> > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> > index 7ca2d46c08cc..8922e2887779 100644
> > --- a/arch/x86/kernel/paravirt.c
> > +++ b/arch/x86/kernel/paravirt.c
> > @@ -80,11 +80,16 @@ static unsigned paravirt_patch_call(void *insn_buff, const void *target,
> >  }
> >
> >  #ifdef CONFIG_PARAVIRT_XXL
> > -/* identity function, which can be inlined */
> > -u64 notrace _paravirt_ident_64(u64 x)
> > -{
> > -     return x;
> > -}
> > +extern u64 _paravirt_ident_64(u64 x);
> > +asm (".pushsection .entry.text, \"ax\"\n"
> > +     ".global _paravirt_ident_64\n"
> > +     "_paravirt_ident_64:\n\t"
> > +     ASM_ENDBR
> > +     "mov %" _ASM_ARG1 ", %" _ASM_AX "\n\t"
> > +     ASM_RET
> > +     ".size _paravirt_ident_64, . - _paravirt_ident_64\n\t"
> > +     ".type _paravirt_ident_64, @function\n\t"
> > +     ".popsection");
> >  #endif
> >
> >  DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
> >