[PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives

Sohil Mehta posted 15 patches 4 months ago
There is a newer version of this series
[PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Sohil Mehta 4 months ago
For patching, the kernel initializes a temporary mm area in the lower
half of the address range. Disable LASS enforcement by toggling the
RFLAGS.AC bit during patching to avoid triggering a #GP fault.

Introduce LASS-specific stac()/clac() helpers along with comments to
clarify their usage versus the existing stac()/clac() helpers for SMAP.

Text poking functions used while patching kernel alternatives use the
standard memcpy()/memset(). However, objtool complains about calling
dynamic functions within an AC=1 region.

One workaround is to add memcpy() and memset() to the list of functions
allowed by objtool. However, that would provide a blanket exemption for
all usages of memcpy() and memset().

Instead, replace the standard memcpy() and memset() calls in the text
poking functions with their unoptimized inline versions. Considering
that patching is usually small, there is no performance impact expected.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
v10:
 - Revert to the inline functions instead of open-coding in assembly.
 - Simplify code comments.
---
 arch/x86/include/asm/smap.h   | 35 +++++++++++++++++++++++++++++++++--
 arch/x86/kernel/alternative.c | 18 ++++++++++++++++--
 2 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..3ecb4b0de1f9 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -23,18 +23,49 @@
 
 #else /* __ASSEMBLER__ */
 
+/*
+ * The CLAC/STAC instructions toggle the enforcement of X86_FEATURE_SMAP
+ * and X86_FEATURE_LASS.
+ *
+ * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page
+ * tables. The kernel is not allowed to touch pages with the bit set
+ * unless the AC bit is set.
+ *
+ * Use stac()/clac() when accessing userspace (_PAGE_USER) mappings,
+ * regardless of location.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
 static __always_inline void clac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
 	alternative("", "clac", X86_FEATURE_SMAP);
 }
 
 static __always_inline void stac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
 	alternative("", "stac", X86_FEATURE_SMAP);
 }
 
+/*
+ * LASS enforcement is based on bit 63 of the virtual address. The
+ * kernel is not allowed to touch memory in the lower half of the
+ * virtual address space unless the AC bit is set.
+ *
+ * Use lass_stac()/lass_clac() when accessing kernel mappings
+ * (!_PAGE_USER) in the lower half of the address space.
+ */
+
+static __always_inline void lass_clac(void)
+{
+	alternative("", "clac", X86_FEATURE_LASS);
+}
+
+static __always_inline void lass_stac(void)
+{
+	alternative("", "stac", X86_FEATURE_LASS);
+}
+
 static __always_inline unsigned long smap_save(void)
 {
 	unsigned long flags;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 79ae9cb50019..dc90b421d760 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2409,16 +2409,30 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
 __ro_after_init struct mm_struct *text_poke_mm;
 __ro_after_init unsigned long text_poke_mm_addr;
 
+/*
+ * Text poking creates and uses a mapping in the lower half of the
+ * address space. Relax LASS enforcement when accessing the poking
+ * address.
+ *
+ * Also, objtool enforces a strict policy of "no function calls within
+ * AC=1 regions". Adhere to the policy by using inline versions of
+ * memcpy()/memset() that will never result in a function call.
+ */
+
 static void text_poke_memcpy(void *dst, const void *src, size_t len)
 {
-	memcpy(dst, src, len);
+	lass_stac();
+	__inline_memcpy(dst, src, len);
+	lass_clac();
 }
 
 static void text_poke_memset(void *dst, const void *src, size_t len)
 {
 	int c = *(const int *)src;
 
-	memset(dst, c, len);
+	lass_stac();
+	__inline_memset(dst, c, len);
+	lass_clac();
 }
 
 typedef void text_poke_f(void *dst, const void *src, size_t len);
-- 
2.43.0
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Borislav Petkov 3 months, 2 weeks ago
On Mon, Oct 06, 2025 at 11:51:07PM -0700, Sohil Mehta wrote:
> +static __always_inline void lass_clac(void)
> +{
> +	alternative("", "clac", X86_FEATURE_LASS);
> +}
> +
> +static __always_inline void lass_stac(void)
> +{
> +	alternative("", "stac", X86_FEATURE_LASS);
> +}

So I probably missed the whole discussion on how we arrived at
lass_{stac,clac}() but just in case, those names sound silly.

IOW, I'd do this ontop:

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 3ecb4b0de1f9..066d83a6b1ff 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -55,16 +55,8 @@ static __always_inline void stac(void)
  * Use lass_stac()/lass_clac() when accessing kernel mappings
  * (!_PAGE_USER) in the lower half of the address space.
  */
-
-static __always_inline void lass_clac(void)
-{
-	alternative("", "clac", X86_FEATURE_LASS);
-}
-
-static __always_inline void lass_stac(void)
-{
-	alternative("", "stac", X86_FEATURE_LASS);
-}
+#define lass_disable()		stac()
+#define lass_enable()		clac()
 
 static __always_inline unsigned long smap_save(void)
 {
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 6a96dbc60bf1..6cdf5c226c51 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2487,18 +2487,18 @@ __ro_after_init unsigned long text_poke_mm_addr;
 
 static void text_poke_memcpy(void *dst, const void *src, size_t len)
 {
-	lass_stac();
+	lass_disable();
 	__inline_memcpy(dst, src, len);
-	lass_clac();
+	lass_enable();
 }
 
 static void text_poke_memset(void *dst, const void *src, size_t len)
 {
 	int c = *(const int *)src;
 
-	lass_stac();
+	lass_disable();
 	__inline_memset(dst, c, len);
-	lass_clac();
+	lass_enable();
 }
 
 typedef void text_poke_f(void *dst, const void *src, size_t len);

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Peter Zijlstra 3 months, 2 weeks ago
On Tue, Oct 21, 2025 at 10:03:28PM +0200, Borislav Petkov wrote:
> On Mon, Oct 06, 2025 at 11:51:07PM -0700, Sohil Mehta wrote:
> > +static __always_inline void lass_clac(void)
> > +{
> > +	alternative("", "clac", X86_FEATURE_LASS);
> > +}
> > +
> > +static __always_inline void lass_stac(void)
> > +{
> > +	alternative("", "stac", X86_FEATURE_LASS);
> > +}
> 
> So I probably missed the whole discussion on how we arrived at
> lass_{stac,clac}() but just in case, those names sound silly.
> 

Initially the suggestion was to use stac/clac directly iirc; but that
looses the information these are for LASS only. Hence the LASS specific
ones.

(its an unfortunate arch detail that LASS and SMAP both use the AC flag
and all that)

> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index 3ecb4b0de1f9..066d83a6b1ff 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -55,16 +55,8 @@ static __always_inline void stac(void)
>   * Use lass_stac()/lass_clac() when accessing kernel mappings
>   * (!_PAGE_USER) in the lower half of the address space.
>   */
> -
> -static __always_inline void lass_clac(void)
> -{
> -	alternative("", "clac", X86_FEATURE_LASS);
> -}
> -
> -static __always_inline void lass_stac(void)
> -{
> -	alternative("", "stac", X86_FEATURE_LASS);
> -}
> +#define lass_disable()		stac()
> +#define lass_enable()		clac()

But that's not the same, stac() and clac() are FEATURE_SMAP, these are
FEATURE_LASS.

If you really want the _disable _enable naming that's fine with me, but
then perhaps we should also s/clac/smap_disable/ and s/stac/smap_enable/
for consistency.
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Borislav Petkov 3 months, 2 weeks ago
On Wed, Oct 22, 2025 at 10:25:41AM +0200, Peter Zijlstra wrote:
> Initially the suggestion was to use stac/clac directly iirc; but that
> looses the information these are for LASS only. Hence the LASS specific
> ones.

Yap.

> (its an unfortunate arch detail that LASS and SMAP both use the AC flag
> and all that)

That is an implementation detail and users of the interface shouldn't care.

> But that's not the same, stac() and clac() are FEATURE_SMAP, these are
> FEATURE_LASS.

So?

Are you thinking of toggling features and then something else getting disabled
in the process?

> If you really want the _disable _enable naming that's fine with me, but
> then perhaps we should also s/clac/smap_disable/ and s/stac/smap_enable/
> for consistency.

So the enable/disable thing is I think what makes this a lot more
understandable when you read it this way: "disable linear address separation
around this code". And that is regardless of how the underlying machinery does
that toggling of LASS.

As to stac/clac - I wouldn't touch them. They've been there forever so it'll
only be unnecessary churn.

Btw, if you need an example which already does that:

arch/x86/include/asm/uaccess.h:37:#define __uaccess_begin() stac()
arch/x86/include/asm/uaccess.h:38:#define __uaccess_end()   clac()

So the lass_{enable,disable} will be yet another incarnation of this pattern.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Peter Zijlstra 3 months, 2 weeks ago
On Wed, Oct 22, 2025 at 11:40:19AM +0200, Borislav Petkov wrote:

> > But that's not the same, stac() and clac() are FEATURE_SMAP, these are
> > FEATURE_LASS.
> 
> So?

That's confusing. Just keep them separate alternatives.

> Are you thinking of toggling features and then something else getting disabled
> in the process?

I'm thinking that machines without LASS don't need the clac/stac in
these places. And when reading the asm, the FEATURE_LASS is a clue.
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Borislav Petkov 3 months, 2 weeks ago
On Wed, Oct 22, 2025 at 12:22:53PM +0200, Peter Zijlstra wrote:
> I'm thinking that machines without LASS don't need the clac/stac in
> these places. And when reading the asm, the FEATURE_LASS is a clue.

Yeah, makes sense. The second alternative with LASS sounds like the optimal
thing.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Sohil Mehta 3 months, 2 weeks ago
On 10/21/2025 1:03 PM, Borislav Petkov wrote:
> On Mon, Oct 06, 2025 at 11:51:07PM -0700, Sohil Mehta wrote:
>> +static __always_inline void lass_clac(void)
>> +{
>> +	alternative("", "clac", X86_FEATURE_LASS);
>> +}
>> +
>> +static __always_inline void lass_stac(void)
>> +{
>> +	alternative("", "stac", X86_FEATURE_LASS);
>> +}
> 
> So I probably missed the whole discussion on how we arrived at
> lass_{stac,clac}() but just in case, those names sound silly.
> 

I am okay with lass_enable()/lass_disable() if we can all agree on it.

PeterZ didn't like lass_enable_enforcement()/lass_enable_enforcement()
when it was proposed. But your suggestion is much shorter, so maybe it
would work for him.
https://lore.kernel.org/lkml/20250626134921.GK1613200@noisy.programming.kicks-ass.net/

Though, there is a slight semantic difference we need to be careful
about. LASS manages 2 types of kernel accesses: Data and Instruction fetch.

1) The STAC/CLAC only control the kernel *data* accesses into the lower
half.

2) CR4.LASS is what truly controls the entire mechanism. If an
instruction fetch needs to happen from a lower address, CR4.LASS must be
cleared to disable LASS completely. (See patch 6 and 7).

In the series, we directly write to the CR4 bits, so they don't have any
wrappers. But in the future, lass_enable()/lass_disable() could be
confusing if wrappers were added for the CR4 toggling.

> IOW, I'd do this ontop:
> 

> +#define lass_disable()		stac()
> +#define lass_enable()		clac()
>  

There is an issue here which you had originally objected to.
https://lore.kernel.org/lkml/20240710171836.GGZo7CbFJeZwLCZUAt@fat_crate.local/
https://lore.kernel.org/lkml/20240711012333.GAZo80FU30_x77otP4@fat_crate.local/

These new versions of lass_disable()/lass_enable() will toggle the AC
flag on older platforms without X86_FEATURE_LASS. It definitely makes
the code easier to read and maintain if we are okay with the minor
performance penalty.
>  static void text_poke_memcpy(void *dst, const void *src, size_t len)
>  {
> -	lass_stac();
> +	lass_disable();
>  	__inline_memcpy(dst, src, len);
> -	lass_clac();
> +	lass_enable();
>  }
>  
>  static void text_poke_memset(void *dst, const void *src, size_t len)
>  {
>  	int c = *(const int *)src;
>  
> -	lass_stac();
> +	lass_disable();
>  	__inline_memset(dst, c, len);
> -	lass_clac();
> +	lass_enable();
>  }
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Borislav Petkov 3 months, 2 weeks ago
On Tue, Oct 21, 2025 at 01:55:51PM -0700, Sohil Mehta wrote:
> In the series, we directly write to the CR4 bits, so they don't have any
> wrappers. But in the future, lass_enable()/lass_disable() could be
> confusing if wrappers were added for the CR4 toggling.

Are you envisioning to export the CR4.LASS toggling to users like those two or
is former going to be done only at those two places?

Because CR4 toggling is expensive so you probably don't want to do that very
often.

> There is an issue here which you had originally objected to.
> https://lore.kernel.org/lkml/20240710171836.GGZo7CbFJeZwLCZUAt@fat_crate.local/
> https://lore.kernel.org/lkml/20240711012333.GAZo80FU30_x77otP4@fat_crate.local/
> 
> These new versions of lass_disable()/lass_enable() will toggle the AC
> flag on older platforms without X86_FEATURE_LASS. It definitely makes
> the code easier to read and maintain if we are okay with the minor
> performance penalty.

Hmm, we probably should measure that. The text poking should be a relatively
seldom operation but we should at least do a quick measurement to see whether
something registers on the radar...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Sohil Mehta 3 months, 2 weeks ago
On 10/22/2025 2:56 AM, Borislav Petkov wrote:
> On Tue, Oct 21, 2025 at 01:55:51PM -0700, Sohil Mehta wrote:
>> In the series, we directly write to the CR4 bits, so they don't have any
>> wrappers. But in the future, lass_enable()/lass_disable() could be
>> confusing if wrappers were added for the CR4 toggling.
> 
> Are you envisioning to export the CR4.LASS toggling to users like those two or
> is former going to be done only at those two places?
> 
> Because CR4 toggling is expensive so you probably don't want to do that very
> often.
> 

I agree. My expectation is that those won't grow much beyond the
existing ones.

My understanding from your discussion with PeterZ is that we would use
lass_enable()/_disable() with the LASS alternatives but leave the
existing stac()/clac() as-is.

Below is the updated patch with the rename and the text to clarify usages.

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 4f84d421d1cf..4f4a4e0efff5 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -23,18 +23,52 @@

 #else /* __ASSEMBLER__ */

+/*
+ * The CLAC/STAC instructions toggle the enforcement of
+ * X86_FEATURE_SMAP along with X86_FEATURE_LASS.
+ *
+ * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page
+ * tables. The kernel is not allowed to touch pages with the bit set
+ * unless the AC bit is set.
+ *
+ * Use stac()/clac() when accessing userspace (_PAGE_USER) mappings,
+ * regardless of location.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
 static __always_inline void clac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
 	alternative("", "clac", X86_FEATURE_SMAP);
 }

 static __always_inline void stac(void)
 {
-	/* Note: a barrier is implicit in alternative() */
 	alternative("", "stac", X86_FEATURE_SMAP);
 }

+/*
+ * LASS enforcement is based on bit 63 of the virtual address. The
+ * kernel is not allowed to touch memory in the lower half of the
+ * virtual address space unless the AC bit is set.
+ *
+ * Use lass_disable()/lass_enable() when accessing kernel (!_PAGE_USER)
+ * mappings in the lower half of the address space that are blocked by
+ * LASS, but not by SMAP.
+ *
+ * Note: a barrier is implicit in alternative().
+ */
+
+static __always_inline void lass_enable(void)
+{
+	alternative("", "clac", X86_FEATURE_LASS);
+}
+
+static __always_inline void lass_disable(void)
+{
+	alternative("", "stac", X86_FEATURE_LASS);
+}
+
 static __always_inline unsigned long smap_save(void)
 {
 	unsigned long flags;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 8ee5ff547357..b38dbf08d5cd 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2469,16 +2469,30 @@ void __init_or_module text_poke_early(void
*addr, const void *opcode,
 __ro_after_init struct mm_struct *text_poke_mm;
 __ro_after_init unsigned long text_poke_mm_addr;

+/*
+ * Text poking creates and uses a mapping in the lower half of the
+ * address space. Relax LASS enforcement when accessing the poking
+ * address.
+ *
+ * objtool enforces a strict policy of "no function calls within AC=1
+ * regions". Adhere to the policy by using inline versions of
+ * memcpy()/memset() that will never result in a function call.
+ */
+
 static void text_poke_memcpy(void *dst, const void *src, size_t len)
 {
-	memcpy(dst, src, len);
+	lass_disable();
+	__inline_memcpy(dst, src, len);
+	lass_enable();
 }

 static void text_poke_memset(void *dst, const void *src, size_t len)
 {
 	int c = *(const int *)src;

-	memset(dst, c, len);
+	lass_disable();
+	__inline_memset(dst, c, len);
+	lass_enable();
 }

 typedef void text_poke_f(void *dst, const void *src, size_t len);
RE: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Luck, Tony 3 months, 2 weeks ago
> +/*
> + * LASS enforcement is based on bit 63 of the virtual address. The
> + * kernel is not allowed to touch memory in the lower half of the
> + * virtual address space unless the AC bit is set.
> + *
> + * Use lass_disable()/lass_enable() when accessing kernel (!_PAGE_USER)
> + * mappings in the lower half of the address space that are blocked by
> + * LASS, but not by SMAP.

Maybe "when accessing kernel data ..."

Also add that for instruction fetch CR4.LASS must be cleared.

-Tony
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Edgecombe, Rick P 4 months ago
It's not just used for alternatives anymore. bpf, kprobes, etc use it too. Maybe
drop "alternatives" from the subject?

On Mon, 2025-10-06 at 23:51 -0700, Sohil Mehta wrote:
> For patching, the kernel initializes a temporary mm area in the lower
> half of the address range. Disable LASS enforcement by toggling the
> RFLAGS.AC bit during patching to avoid triggering a #GP fault.
> 
> Introduce LASS-specific stac()/clac() helpers along with comments to
> clarify their usage versus the existing stac()/clac() helpers for SMAP.
> 
> Text poking functions used while patching kernel alternatives use the
> standard memcpy()/memset(). However, objtool complains about calling
> dynamic functions within an AC=1 region.
> 
> One workaround is to add memcpy() and memset() to the list of functions
> allowed by objtool. However, that would provide a blanket exemption for
> all usages of memcpy() and memset().
> 
> Instead, replace the standard memcpy() and memset() calls in the text
> poking functions with their unoptimized inline versions. Considering
> that patching is usually small, there is no performance impact expected.
> 
> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
> ---
> v10:
>  - Revert to the inline functions instead of open-coding in assembly.
>  - Simplify code comments.
> ---
>  arch/x86/include/asm/smap.h   | 35 +++++++++++++++++++++++++++++++++--
>  arch/x86/kernel/alternative.c | 18 ++++++++++++++++--
>  2 files changed, 49 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index 4f84d421d1cf..3ecb4b0de1f9 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -23,18 +23,49 @@
>  
>  #else /* __ASSEMBLER__ */
>  
> +/*
> + * The CLAC/STAC instructions toggle the enforcement of X86_FEATURE_SMAP
> + * and X86_FEATURE_LASS.
> + *
> + * SMAP enforcement is based on the _PAGE_BIT_USER bit in the page
> + * tables. The kernel is not allowed to touch pages with the bit set
> + * unless the AC bit is set.
> + *
> + * Use stac()/clac() when accessing userspace (_PAGE_USER) mappings,
> + * regardless of location.
> + *
> + * Note: a barrier is implicit in alternative().
> + */
> +
>  static __always_inline void clac(void)
>  {
> -	/* Note: a barrier is implicit in alternative() */
>  	alternative("", "clac", X86_FEATURE_SMAP);
>  }
>  
>  static __always_inline void stac(void)
>  {
> -	/* Note: a barrier is implicit in alternative() */
>  	alternative("", "stac", X86_FEATURE_SMAP);
>  }
>  
> +/*
> + * LASS enforcement is based on bit 63 of the virtual address. The
> + * kernel is not allowed to touch memory in the lower half of the
> + * virtual address space unless the AC bit is set.
> + *
> + * Use lass_stac()/lass_clac() when accessing kernel mappings
> + * (!_PAGE_USER) in the lower half of the address space.
> + */
> +

The above variant has "a barrier is implicit in alternative", is it not needed
here too? Actually, not sure what that comment is trying to convey anyway.

> +static __always_inline void lass_clac(void)
> +{
> +	alternative("", "clac", X86_FEATURE_LASS);
> +}
> +
> +static __always_inline void lass_stac(void)
> +{
> +	alternative("", "stac", X86_FEATURE_LASS);
> +}
> +

Not a strong opinion, but the naming of stac()/clac() lass_stac()/lass_clac() is
a bit confusing to me. stac/clac instructions now has LASS and SMAP behavior.
Why keep the smap behavior implicit and give LASS a special variant?

The other odd aspect is that calling stac()/clac() is needed for LASS in some
places too, right? But stac()/clac() depend on X86_FEATURE_SMAP not
X86_FEATURE_LASS. A reader might wonder, why do we not need the lass variant
there too.

I'd expect in the real world LASS won't be found without SMAP. Maybe it could be
worth just improving the comment around stac()/clac() to include some nod that
it is doing LASS stuff too, or that it relies on that USER mappings are only
found in the lower half, but KERNEL mappings are not only found upper half.

>  static __always_inline unsigned long smap_save(void)
>  {
>  	unsigned long flags;
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 79ae9cb50019..dc90b421d760 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -2409,16 +2409,30 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
>  __ro_after_init struct mm_struct *text_poke_mm;
>  __ro_after_init unsigned long text_poke_mm_addr;
>  
> +/*
> + * Text poking creates and uses a mapping in the lower half of the
> + * address space. Relax LASS enforcement when accessing the poking
> + * address.
> + *
> + * Also, objtool enforces a strict policy of "no function calls within
> + * AC=1 regions". Adhere to the policy by using inline versions of
> + * memcpy()/memset() that will never result in a function call.

Is "Also, ..." here really a separate issue? What is the connection to
lass_stac/clac()?

> + */
> +
>  static void text_poke_memcpy(void *dst, const void *src, size_t len)
>  {
> -	memcpy(dst, src, len);
> +	lass_stac();
> +	__inline_memcpy(dst, src, len);
> +	lass_clac();
>  }
>  
>  static void text_poke_memset(void *dst, const void *src, size_t len)
>  {
>  	int c = *(const int *)src;
>  
> -	memset(dst, c, len);
> +	lass_stac();
> +	__inline_memset(dst, c, len);
> +	lass_clac();
>  }
>  
>  typedef void text_poke_f(void *dst, const void *src, size_t len);

Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Sohil Mehta 4 months ago
On 10/7/2025 9:55 AM, Edgecombe, Rick P wrote:
> It's not just used for alternatives anymore. bpf, kprobes, etc use it too. Maybe
> drop "alternatives" from the subject?
> 

Yeah, I was just being lazy. The file is still called alternatives.c and
that's probably what most folks are familiar with.

How about:
x86/text-patching: Disable LASS when patching kernel code

> 
> The above variant has "a barrier is implicit in alternative", is it not needed
> here too? Actually, not sure what that comment is trying to convey anyway.
> 

Yes, the same implication holds true for the LASS versions as well.
I assume it is to let users know that a separate memory barrier is not
needed to prevent the memory accesses following the STAC/CLAC
instructions from getting reordered.

I will add a similar note to the lass_clac()/stac() comments as well.

> Not a strong opinion, but the naming of stac()/clac() lass_stac()/lass_clac() is
> a bit confusing to me. stac/clac instructions now has LASS and SMAP behavior.
> Why keep the smap behavior implicit and give LASS a special variant?
> 
> The other odd aspect is that calling stac()/clac() is needed for LASS in some
> places too, right? But stac()/clac() depend on X86_FEATURE_SMAP not
> X86_FEATURE_LASS. A reader might wonder, why do we not need the lass variant
> there too.
> 
> I'd expect in the real world LASS won't be found without SMAP. Maybe it could be
> worth just improving the comment around stac()/clac() to include some nod that
> it is doing LASS stuff too, or that it relies on that USER mappings are only
> found in the lower half, but KERNEL mappings are not only found upper half.
> 

LASS (data access) depends on SMAP in the hardware as well as the
kernel. The STAC/CLAC instructions toggle LASS alongwith SMAP. One
option is to use the current stac()/clac() instruction for all cases.
However, that would mean unnecessary AC bit toggling during
text-patching on systems without LASS.

The code comments mainly describe how these helpers should be used,
rather than why they exist the way they do.

>> /* Use stac()/clac() when accessing userspace (_PAGE_USER)
>> mappings, regardless of location. */
>> 
>> /* Use lass_stac()/lass_clac() when accessing kernel mappings (!
>> _PAGE_USER) in the lower half of the address space. */
Does this look accurate? The difference is subtle. Also, there is some
potential for incorrect usage, but Dave would prefer to track them
separately.

I can add more explanation to the commit message. Any preferred wording?
Also, the separate patch that Dave recommended would help clarify things
as well.

>>  
>> +/*
>> + * Text poking creates and uses a mapping in the lower half of the
>> + * address space. Relax LASS enforcement when accessing the poking
>> + * address.
>> + *
>> + * Also, objtool enforces a strict policy of "no function calls within
>> + * AC=1 regions". Adhere to the policy by using inline versions of
>> + * memcpy()/memset() that will never result in a function call.
> 
> Is "Also, ..." here really a separate issue? What is the connection to
> lass_stac/clac()?
> 

The issues are interdependent. We need the STAC/CLAC because text poking
accesses special memory. We require the inline memcpy/memset because we
have now added the STAC/CLAC usage and objtool guards against the
potential misuse of STAC/CLAC.

Were you looking for any specific change to the wording?

>>  static void text_poke_memcpy(void *dst, const void *src, size_t len)
>>  {
>> -	memcpy(dst, src, len);
>> +	lass_stac();
>> +	__inline_memcpy(dst, src, len);
>> +	lass_clac();
>>  }
>>
Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Edgecombe, Rick P 4 months ago
On Tue, 2025-10-07 at 15:28 -0700, Sohil Mehta wrote:
> On 10/7/2025 9:55 AM, Edgecombe, Rick P wrote:
> > It's not just used for alternatives anymore. bpf, kprobes, etc use it too. Maybe
> > drop "alternatives" from the subject?
> > 
> 
> Yeah, I was just being lazy. The file is still called alternatives.c and
> that's probably what most folks are familiar with.
> 
> How about:
> x86/text-patching: Disable LASS when patching kernel code

"x86/alternatives: Disable LASS when patching kernel alternatives"

I meant the last reference there ^. I think the "x86/alternatives:" part should
match the rest of the commits to the file.

"x86/alternatives: Disable LASS when patching kernel kernel code" sounds more
accurate to me.

> 
> > 
> > The above variant has "a barrier is implicit in alternative", is it not needed
> > here too? Actually, not sure what that comment is trying to convey anyway.
> > 
> 
> Yes, the same implication holds true for the LASS versions as well.
> I assume it is to let users know that a separate memory barrier is not
> needed to prevent the memory accesses following the STAC/CLAC
> instructions from getting reordered.
> 
> I will add a similar note to the lass_clac()/stac() comments as well.

Up to you.

> 
> > Not a strong opinion, but the naming of stac()/clac() lass_stac()/lass_clac() is
> > a bit confusing to me. stac/clac instructions now has LASS and SMAP behavior.
> > Why keep the smap behavior implicit and give LASS a special variant?
> > 
> > The other odd aspect is that calling stac()/clac() is needed for LASS in some
> > places too, right? But stac()/clac() depend on X86_FEATURE_SMAP not
> > X86_FEATURE_LASS. A reader might wonder, why do we not need the lass variant
> > there too.
> > 
> > I'd expect in the real world LASS won't be found without SMAP. Maybe it could be
> > worth just improving the comment around stac()/clac() to include some nod that
> > it is doing LASS stuff too, or that it relies on that USER mappings are only
> > found in the lower half, but KERNEL mappings are not only found upper half.
> > 
> 
> LASS (data access) depends on SMAP in the hardware as well as the
> kernel. The STAC/CLAC instructions toggle LASS alongwith SMAP. One
> option is to use the current stac()/clac() instruction for all cases.
> However, that would mean unnecessary AC bit toggling during
> text-patching on systems without LASS.

Honestly, just unconditionally doing stac/clac doesn't sound that bad to me. We
already unconditionally enable SMAP, right? If there was some big slowdown for a
single copy, people would want an option to disable it. And with text patching
it's part a heavier operation already.

Was there previous feedback on that option?

> 
> The code comments mainly describe how these helpers should be used,
> rather than why they exist the way they do.
> 
> > > /* Use stac()/clac() when accessing userspace (_PAGE_USER)
> > > mappings, regardless of location. */
> > > 
> > > /* Use lass_stac()/lass_clac() when accessing kernel mappings (!
> > > _PAGE_USER) in the lower half of the address space. */
> Does this look accurate? The difference is subtle. Also, there is some
> potential for incorrect usage, but Dave would prefer to track them
> separately.
> 
> I can add more explanation to the commit message. Any preferred wording?
> Also, the separate patch that Dave recommended would help clarify things
> as well.

No preference. The main comment was just that, as someone looking at this part
fresh, it was a little unclear. Especially with the non-obvious SMAP-LASS link.

> 
> > >  
> > > +/*
> > > + * Text poking creates and uses a mapping in the lower half of the
> > > + * address space. Relax LASS enforcement when accessing the poking
> > > + * address.
> > > + *
> > > + * Also, objtool enforces a strict policy of "no function calls within
> > > + * AC=1 regions". Adhere to the policy by using inline versions of
> > > + * memcpy()/memset() that will never result in a function call.
> > 
> > Is "Also, ..." here really a separate issue? What is the connection to
> > lass_stac/clac()?
> > 
> 
> The issues are interdependent. We need the STAC/CLAC because text poking
> accesses special memory. We require the inline memcpy/memset because we
> have now added the STAC/CLAC usage and objtool guards against the
> potential misuse of STAC/CLAC.
> 
> Were you looking for any specific change to the wording?

Ah ok, but the compiler could have always uninlined the existing memcpy calls
right? So there is an existing theoretical problem, I would think.

But that link sounds strong enough to do it in one patch. If it were me I would
nod at the existing theoretical issue.

> 
> > >  static void text_poke_memcpy(void *dst, const void *src, size_t len)
> > >  {
> > > -	memcpy(dst, src, len);
> > > +	lass_stac();
> > > +	__inline_memcpy(dst, src, len);
> > > +	lass_clac();
> > >  }
> > >  
> 
> 

Re: [PATCH v10 03/15] x86/alternatives: Disable LASS when patching kernel alternatives
Posted by Sohil Mehta 4 months ago
On 10/8/2025 9:22 AM, Edgecombe, Rick P wrote:
> Honestly, just unconditionally doing stac/clac doesn't sound that bad to me. We
> already unconditionally enable SMAP, right? If there was some big slowdown for a
> single copy, people would want an option to disable it. And with text patching
> it's part a heavier operation already.
> 
> Was there previous feedback on that option?
> 

Yes. Boris had expressed some concern about the extra toggles.

Dave and PeterZ mainly wanted to keep it separate for code isolation and
better understanding.

https://lore.kernel.org/lkml/7bbf9cae-6392-47a4-906c-7c27b1b1223d@intel.com/

I'll leave them as separate.
>> The issues are interdependent. We need the STAC/CLAC because text poking
>> accesses special memory. We require the inline memcpy/memset because we
>> have now added the STAC/CLAC usage and objtool guards against the
>> potential misuse of STAC/CLAC.
>>
>> Were you looking for any specific change to the wording?
> 
> Ah ok, but the compiler could have always uninlined the existing memcpy calls
> right? So there is an existing theoretical problem, I would think.
> 

What theoretical problem?

The existing text_poke_memcpy() is a wrapper over the kernel standard
memcpy(). That is an exported function call which shouldn't be inlined
(or uninlined), right?