[GIT pull] x86/urgent for v6.16-rc7

Thomas Gleixner posted 1 patch 2 months, 2 weeks ago
arch/x86/coco/sev/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
[GIT pull] x86/urgent for v6.16-rc7
Posted by Thomas Gleixner 2 months, 2 weeks ago
Linus,

please pull the latest x86/urgent branch from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-urgent-2025-07-20

up to:  6b995d01683f: x86/sev: Work around broken noinstr on GCC

A single fix for a GCC wreckage, which emits a KCSAN instrumentation call
in __sev_es_nmi_complete() despite the function being annotated with
'noinstr'. As all functions in that source file are noinstr, exclude the
whole file from KCSAN in the Makefile to cure it.

Thanks,

	tglx

------------------>
Ard Biesheuvel (1):
      x86/sev: Work around broken noinstr on GCC


 arch/x86/coco/sev/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/sev/Makefile b/arch/x86/coco/sev/Makefile
index db3255b979bd..342d79f0ab6a 100644
--- a/arch/x86/coco/sev/Makefile
+++ b/arch/x86/coco/sev/Makefile
@@ -5,5 +5,6 @@ obj-y += core.o sev-nmi.o vc-handle.o
 # Clang 14 and older may fail to respect __no_sanitize_undefined when inlining
 UBSAN_SANITIZE_sev-nmi.o	:= n
 
-# GCC may fail to respect __no_sanitize_address when inlining
+# GCC may fail to respect __no_sanitize_address or __no_kcsan when inlining
 KASAN_SANITIZE_sev-nmi.o	:= n
+KCSAN_SANITIZE_sev-nmi.o	:= n
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by pr-tracker-bot@kernel.org 2 months, 2 weeks ago
The pull request you sent on Sun, 20 Jul 2025 14:05:01 +0200 (CEST):

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-urgent-2025-07-20

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/07fa9cad54609df3eea00cd5b167df6088ce01a6

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by Linus Torvalds 2 months, 2 weeks ago
On Sun, 20 Jul 2025 at 05:05, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> A single fix for a GCC wreckage, which emits a KCSAN instrumentation call
> in __sev_es_nmi_complete() despite the function being annotated with
> 'noinstr'. As all functions in that source file are noinstr, exclude the
> whole file from KCSAN in the Makefile to cure it.

Hmm. I'm not entirely sure if this counts as a gcc bug.

In particular, look at the *declaration* of __sev_es_nmi_complete() in
<asm/sev.h>, and note the complete lack of 'noinstr':

    extern void __sev_es_nmi_complete(void);

and I wonder if it might be the case that gcc might pick up the lack
of 'noinstr' from the declaration, even if it's there in the
definition..

The fix for this obviously ends up working and is fine regardless, but
it's _possible_ that this is self-inflicted pain rather than an
outright compiler bug. Because having a declaration and a definition
that doesn't match sounds like a bad idea in the first place.

               Linus
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by tglx 2 months, 2 weeks ago
On Sun, Jul 20 2025 at 11:35, Linus Torvalds wrote:
> On Sun, 20 Jul 2025 at 05:05, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> A single fix for a GCC wreckage, which emits a KCSAN instrumentation call
>> in __sev_es_nmi_complete() despite the function being annotated with
>> 'noinstr'. As all functions in that source file are noinstr, exclude the
>> whole file from KCSAN in the Makefile to cure it.
>
> Hmm. I'm not entirely sure if this counts as a gcc bug.
>
> In particular, look at the *declaration* of __sev_es_nmi_complete() in
> <asm/sev.h>, and note the complete lack of 'noinstr':
>
>     extern void __sev_es_nmi_complete(void);
>
> and I wonder if it might be the case that gcc might pick up the lack
> of 'noinstr' from the declaration, even if it's there in the
> definition..
>
> The fix for this obviously ends up working and is fine regardless, but
> it's _possible_ that this is self-inflicted pain rather than an
> outright compiler bug.

I agree. See below.

> Because having a declaration and a definition that doesn't match
> sounds like a bad idea in the first place.

I disagree here. When the compiler evaluates the function body it must
have evaluated noinstr on the definition, no?

Unfortunately the data provided in the change log does not tell what was
actually instrumented inside of that function. I just reproduced it
locally.

The problem are the invocations of ghcb_set_sw_*(), which are
implemented through a macro maze, but end up being marked
__always_inline all the way down. __always_inline is supposed to
guarantee that the noinstr (or other) constraints of the calling
function are preserved.

What makes these ghcb_set_sw_*() inlines so special?

They are defined by DEFINE_GHCB_ACCESSORS(field) and end up with this:

	static __always_inline void ghcb_set_##field(struct ghcb *ghcb, u64 value) \
	{									\
		__set_bit(GHCB_BITMAP_IDX(field),				\
			  (unsigned long *)&ghcb->save.valid_bitmap);		\
		ghcb->save.field = value;					\
	}

__set_bit() resolves to:

static __always_inline void ___set_bit(unsigned long nr, volatile unsigned long *addr)
{
	instrument_write(addr + BIT_WORD(nr), sizeof(long));
	arch___set_bit(nr, addr);
}

instrument_write() resolves to

static __always_inline void instrument_write(const volatile void *v, size_t size)
{
	kasan_check_write(v, size);
	kcsan_check_write(v, size);
}

and kcsan_check_write() is:

#define __kcsan_check_write(ptr, size)                                         \
	__kcsan_check_access(ptr, size, KCSAN_ACCESS_WRITE)

__kcsan_check_access() is a function provided by the kernel. So GCC just
does as told and emits a function call.

If I replace the __set_bit() in ghcb_set_##field() with arch___set_bit()
the problem goes away.

Excluding the whole file from KCSAN switches to the non-instrumented
version of __set_bit() which directly resolves to arch___set_bit().

Thanks,

        tglx
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by Ard Biesheuvel 2 months, 2 weeks ago
(cc Marco, Daniel)

Hi Thomas,

Thanks for getting to the bottom of this.

On Mon, 21 Jul 2025 at 16:08, tglx <tglx@linutronix.de> wrote:
>
> On Sun, Jul 20 2025 at 11:35, Linus Torvalds wrote:
> > On Sun, 20 Jul 2025 at 05:05, Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> A single fix for a GCC wreckage, which emits a KCSAN instrumentation call
> >> in __sev_es_nmi_complete() despite the function being annotated with
> >> 'noinstr'. As all functions in that source file are noinstr, exclude the
> >> whole file from KCSAN in the Makefile to cure it.
> >
> > Hmm. I'm not entirely sure if this counts as a gcc bug.
> >
> > In particular, look at the *declaration* of __sev_es_nmi_complete() in
> > <asm/sev.h>, and note the complete lack of 'noinstr':
> >
> >     extern void __sev_es_nmi_complete(void);
> >
> > and I wonder if it might be the case that gcc might pick up the lack
> > of 'noinstr' from the declaration, even if it's there in the
> > definition..
> >
> > The fix for this obviously ends up working and is fine regardless, but
> > it's _possible_ that this is self-inflicted pain rather than an
> > outright compiler bug.
>
> I agree. See below.
>
> > Because having a declaration and a definition that doesn't match
> > sounds like a bad idea in the first place.
>
> I disagree here. When the compiler evaluates the function body it must
> have evaluated noinstr on the definition, no?
>
> Unfortunately the data provided in the change log does not tell what was
> actually instrumented inside of that function. I just reproduced it
> locally.
>
...
> __set_bit() resolves to:
>
> static __always_inline void ___set_bit(unsigned long nr, volatile unsigned long *addr)
> {
>         instrument_write(addr + BIT_WORD(nr), sizeof(long));
>         arch___set_bit(nr, addr);
> }
>

So this is the issue right here: an __always_inline function that
unconditionally calls the KASAN/KCSAN check functions. And indeed, the
compiler is not to blame here, because these instrumentations were
emitted by our code, and in a manner that doesn't care about the
compiler attributes that disable KASAN or KCSAN.

The upshot of this is that all noinstr users of __set_bit() end up
calling the instrumented version if the kconfig happens to have KASAN
or KCSAN enabled, and I seriously doubt that this is what we want.
Including one header or the other obviously doesn't work when
disabling these instrumentations at the function level.

Ideally, we'd have __builtin_kasan_enabled() and
__builtin_kcsan_enabled() compiler builtins that provide the correct
answer for the current function, but that will take a while to land if
we start working on it now (for both Clang and GCC).
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by Ard Biesheuvel 2 months, 2 weeks ago
On Mon, 21 Jul 2025 at 04:35, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, 20 Jul 2025 at 05:05, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > A single fix for a GCC wreckage, which emits a KCSAN instrumentation call
> > in __sev_es_nmi_complete() despite the function being annotated with
> > 'noinstr'. As all functions in that source file are noinstr, exclude the
> > whole file from KCSAN in the Makefile to cure it.
>
> Hmm. I'm not entirely sure if this counts as a gcc bug.
>
> In particular, look at the *declaration* of __sev_es_nmi_complete() in
> <asm/sev.h>, and note the complete lack of 'noinstr':
>
>     extern void __sev_es_nmi_complete(void);
>
> and I wonder if it might be the case that gcc might pick up the lack
> of 'noinstr' from the declaration, even if it's there in the
> definition..
>

Just tried this: adding 'noinstr' to the declaration in asm/sev.h
makes no difference at all.

> The fix for this obviously ends up working and is fine regardless, but
> it's _possible_ that this is self-inflicted pain rather than an
> outright compiler bug. Because having a declaration and a definition
> that doesn't match sounds like a bad idea in the first place.
>

Agree with this in principle, and for 'noinstr' in particular, this
may work fine but note that there are __attribute__(()) annotations
that make no sense, or are not permitted on [forward] declarations,
and so having the general rule that declarations and definitions must
have the same annotations may not be feasible in practice.
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by Linus Torvalds 2 months, 2 weeks ago
On Sun, 20 Jul 2025 at 20:14, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Just tried this: adding 'noinstr' to the declaration in asm/sev.h
> makes no difference at all.

Ok, thanks for checking.  It does seem a strange bug.

That said, this area is a mess, and I really do think it's at least
partly *our* mess.

We should mark not only __sev_es_nmi_complete(), but the
sev_es_nmi_complete() inline function wrapper as being 'noinstr'.

But we can't do that, because 'noinstr' explicitly includes
'noinline', so we cannot do something sane like

  static __always_inline noinstr void sev_es_nmi_complete(void) ..

for the wrapper, because the compiler will very correctly say "I'm
sorry Dave, I can't do that".

So I still suspect that yes, it may be a gcc bug, but we're really
doing some things that make me go "at some point you really can't
blame the compiler too much for being confused".

I guess it doesn't matter - the bug is fixed, but I'd personally
hesitate to make a gcc bug report simply because if I was a compiler
person, I would take one look at this code and say "you're insane".

        Linus
Re: [GIT pull] x86/urgent for v6.16-rc7
Posted by Ard Biesheuvel 2 months, 2 weeks ago
On Mon, 21 Jul 2025 at 13:32, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, 20 Jul 2025 at 20:14, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > Just tried this: adding 'noinstr' to the declaration in asm/sev.h
> > makes no difference at all.
>
> Ok, thanks for checking.  It does seem a strange bug.
>
> That said, this area is a mess, and I really do think it's at least
> partly *our* mess.
>
> We should mark not only __sev_es_nmi_complete(), but the
> sev_es_nmi_complete() inline function wrapper as being 'noinstr'.
>
> But we can't do that, because 'noinstr' explicitly includes
> 'noinline', so we cannot do something sane like
>
>   static __always_inline noinstr void sev_es_nmi_complete(void) ..
>
> for the wrapper, because the compiler will very correctly say "I'm
> sorry Dave, I can't do that".
>

I also wonder what the semantics should be in this case, though:
attributes are generally tracked at function granularity or coarser,
and so inlining a noinstr function should either disable
instrumentation for the whole outer function, or disregard the
noinstr, as applying it only to the inlined code is not generally
possible.