[PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end

Carl Worth posted 2 patches 3 months, 1 week ago
Posted by Carl Worth 3 months, 1 week ago
The PSTATE.TCO (Tag Checking Override) bit, when set, causes MTE
tag checking to be disabled. The TCO bit is automatically set by the
hardware when an exception is taken.

Prior to this commit, mte_disable_tco_entry would clear TCO (enable
tag checking) in either of two cases: 1. when the kernel wants tag
checking (KASAN) or 2. when userspace wants tag checking (via
SCTLR_EL1.TCF0).

In the case where only userspace wants tag checking (that is, when
KASAN is off), clearing TCO on entry to the kernel hurts performance:
it results in excess kernel-space tag checking that has not been
requested.

For this case, move the clearing of TCO to user_access_begin, and set
it again in user_access_end. This restricts the tag checking to only
the duration of the userspace accesses, as desired.

This patch has been measured to eliminate over 97% of kernel-side tag
checking during "perf bench futex hash".

Reported-by: Taehyun Noh <taehyun@utexas.edu>
Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
---
 arch/arm64/include/asm/mte.h     | 21 +++++++++++++--------
 arch/arm64/include/asm/uaccess.h | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 70dabc884616..3608ba452da5 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -258,15 +258,20 @@ static inline void set_kernel_mte_policy(struct task_struct *task)
 		return;
 
 	/*
-	 * Re-enable tag checking (TCO set on exception entry). This is only
-	 * necessary if MTE is enabled in either the kernel or the userspace
-	 * task. With MTE disabled in the kernel and disabled or asynchronous
-	 * in userspace, tag check faults (including in uaccesses) are not
-	 * reported, therefore there is no need to re-enable checking.
-	 * This is beneficial on microarchitectures where re-enabling TCO is
-	 * expensive.
+	 * TCO is set on exception entry (which overrides both TCF and
+	 * TCF0 and disables tag checking).
+	 *
+	 * If KASAN is enabled and using MTE (aka "hw_tags"), we clear
+	 * TCO so that the kernel gets the tag checking it needs for
+	 * KASAN_HW_TAGS.
+	 *
+	 * When the kernel needs to enable tag checking temporarily
+	 * (such as before accessing userspace memory in the case that
+	 * userspace has requested tag checking), the kernel can
+	 * temporarily change the state of TCO. See
+	 * user_access_begin().
 	 */
-	if (kasan_hw_tags_enabled() || user_uses_tagcheck())
+	if (kasan_hw_tags_enabled())
 		asm volatile(SET_PSTATE_TCO(0));
 }
 
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 1aa4ecb73429..248741a66c91 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
 {
 	if (unlikely(!access_ok(ptr,len)))
 		return 0;
+
+	/*
+	 * Enable tag checking for the user access if MTE is enabled
+	 * in the userspace task.
+	 *
+	 * Note: We don't need to do anything if KASAN is enabled,
+	 * since that means the tag checking override (TCO) will
+	 * already be disabled. In turn, the TCF0 bits will control
+	 * whether user-space tag checking happens.
+	 */
+	if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
+		asm volatile(SET_PSTATE_TCO(0));
+
 	uaccess_ttbr0_enable();
 	return 1;
 }
+
+static __always_inline void user_access_end(void)
+{
+	/*
+	 * Restore TCO to disable tag checking now that user access is done.
+	 *
+	 * This logic uses the identical condition as in user_access_begin
+	 * to avoid writing PSTATE.TCO with a value identical to what it
+	 * already has (which would needlessly introduce a pipeline flush
+	 * and could impact performance).
+	 */
+	if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
+		asm volatile(SET_PSTATE_TCO(1));
+
+	uaccess_ttbr0_disable();
+}
+
 #define user_access_begin(a,b)	user_access_begin(a,b)
-#define user_access_end()	uaccess_ttbr0_disable()
+#define user_access_end()	user_access_end()
 #define unsafe_put_user(x, ptr, label) \
 	__raw_put_mem("sttr", x, uaccess_mask_ptr(ptr), label, U)
 #define unsafe_get_user(x, ptr, label) \

-- 
2.39.5
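
The policy change in the patch above can be summarized as a small truth table. The following is a minimal userspace sketch, not kernel code: the function names are hypothetical, and the two boolean inputs stand in for the results of kasan_hw_tags_enabled() and user_uses_tagcheck() as used in the diff.

```c
#include <stdbool.h>

/* Before the patch: TCO is cleared on kernel entry if either side wants checks. */
static bool entry_clears_tco_old(bool kasan_hw_tags, bool user_tagcheck)
{
	return kasan_hw_tags || user_tagcheck;
}

/* After the patch: only KASAN_HW_TAGS causes TCO to be cleared on entry. */
static bool entry_clears_tco_new(bool kasan_hw_tags, bool user_tagcheck)
{
	(void)user_tagcheck;
	return kasan_hw_tags;
}

/* After the patch: condition under which user_access_begin()/end() toggle TCO. */
static bool uaccess_toggles_tco(bool kasan_hw_tags, bool user_tagcheck)
{
	return !kasan_hw_tags && user_tagcheck;
}

/* Modelled TCO value inside a user_access_begin()/end() window (true = checks overridden). */
static bool tco_during_uaccess_new(bool kasan_hw_tags, bool user_tagcheck)
{
	bool tco = true; /* set by hardware on exception entry */

	if (entry_clears_tco_new(kasan_hw_tags, user_tagcheck))
		tco = false;
	if (uaccess_toggles_tco(kasan_hw_tags, user_tagcheck))
		tco = false;
	return tco;
}

/* Modelled TCO value for ordinary kernel code outside such a window. */
static bool tco_outside_uaccess_new(bool kasan_hw_tags, bool user_tagcheck)
{
	return !entry_clears_tco_new(kasan_hw_tags, user_tagcheck);
}
```

In this model, the only case whose behaviour changes is KASAN off with userspace tag checking on: TCO stays set for ordinary kernel code and is cleared only inside the user access window.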
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Will Deacon 1 month ago
On Thu, Oct 30, 2025 at 08:49:32PM -0700, Carl Worth wrote:
> The PSTATE.TCO (Tag Checking Override) register, when set causes MTE
> tag checking to be disabled. The TCO bit is automatically set by the
> hardware when an exception is taken.
> 
> Prior to this commit, mte_disable_tco_entry would clear TCO (enable
> tag checking) for either of two cases: 1. When the kernel wants tag
> checking (KASAN) or 2. when userspace wants tag checking (via
> SCTLR.TCF0).
> 
> In the case of userspace desired tag checking, (that is, when KASAN is
> off), clearing TCO on entry to the kernel has negative performance
> implications. This results in excess kernel space tag checking that
> has not been requested.
> 
> For this case, move the clearing of TCO to user_access_begin,
> and set it again in user_access_end. This restricts the tag checking
> to only the duration of the userspace accesses as desired.
> 
> This patch has been measured to eliminate over 97% of kernel-side tag
> checking during "perf bench futex hash"
> 
> Reported-by: Taehyun Noh <taehyun@utexas.edu>
> Signed-off-by: Carl Worth <carl@os.amperecomputing.com>
> ---
>  arch/arm64/include/asm/mte.h     | 21 +++++++++++++--------
>  arch/arm64/include/asm/uaccess.h | 32 +++++++++++++++++++++++++++++++-
>  2 files changed, 44 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 70dabc884616..3608ba452da5 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -258,15 +258,20 @@ static inline void set_kernel_mte_policy(struct task_struct *task)
>  		return;
>  
>  	/*
> -	 * Re-enable tag checking (TCO set on exception entry). This is only
> -	 * necessary if MTE is enabled in either the kernel or the userspace
> -	 * task. With MTE disabled in the kernel and disabled or asynchronous
> -	 * in userspace, tag check faults (including in uaccesses) are not
> -	 * reported, therefore there is no need to re-enable checking.
> -	 * This is beneficial on microarchitectures where re-enabling TCO is
> -	 * expensive.
> +	 * TCO is set on exception entry, (which overrides either of TCF
> +	 * or TCF0 and disables tag checking).
> +	 *
> +	 * If KASAN is enabled and using MTE/(aka "hw_tags") we clear
> +	 * TCO so that the kernel gets the tag-checking it needs for
> +	 * KASAN_HW_TAGS.
> +	 *
> +	 * When the kernel needs to enable tag-checking temporarily,
> +	 * (such as before accessing userspace memory in the case that
> +	 * userspace has requested tag checking), the kernel can
> +	 * temporarily change the state of TCO. See
> +	 * user_access_begin().
>  	 */
> -	if (kasan_hw_tags_enabled() || user_uses_tagcheck())
> +	if (kasan_hw_tags_enabled())
>  		asm volatile(SET_PSTATE_TCO(0));
>  }
>  
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 1aa4ecb73429..248741a66c91 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
>  {
>  	if (unlikely(!access_ok(ptr,len)))
>  		return 0;
> +
> +	/*
> +	 * Enable tag checking for the user access if MTE is enabled
> +	 * in the userspace task.
> +	 *
> +	 * Note: We don't need to do anything if KASAN is enabled,
> +	 * since that means the tag checking override (TCO) will
> +	 * already be disabled. In turn, the TCF0 bits will control
> +	 * whether user-space tag checking happens.
> +	 */
> +	if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
> +		asm volatile(SET_PSTATE_TCO(0));
> +
>  	uaccess_ttbr0_enable();
>  	return 1;
>  }

What about all the uaccess routines that don't call user_access_begin? For
example, copy_from_user().

Will
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Catalin Marinas 1 month ago
On Thu, Jan 08, 2026 at 03:06:30PM +0000, Will Deacon wrote:
> On Thu, Oct 30, 2025 at 08:49:32PM -0700, Carl Worth wrote:
> > The PSTATE.TCO (Tag Checking Override) register, when set causes MTE
> > tag checking to be disabled. The TCO bit is automatically set by the
> > hardware when an exception is taken.
> > 
> > Prior to this commit, mte_disable_tco_entry would clear TCO (enable
> > tag checking) for either of two cases: 1. When the kernel wants tag
> > checking (KASAN) or 2. when userspace wants tag checking (via
> > SCTLR.TCF0).
> > 
> > In the case of userspace desired tag checking, (that is, when KASAN is
> > off), clearing TCO on entry to the kernel has negative performance
> > implications. This results in excess kernel space tag checking that
> > has not been requested.

I would have expected the hardware to avoid any tag checking if
SCTLR_EL1.TCF is 0. I guess the Arm ARM isn't entirely clear (D10.4.1
Tag Checked memory accesses); it seems to only mention TCF and TCMA with
a match-all tag for considering Unchecked accesses.

> > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > index 1aa4ecb73429..248741a66c91 100644
> > --- a/arch/arm64/include/asm/uaccess.h
> > +++ b/arch/arm64/include/asm/uaccess.h
> > @@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
> >  {
> >  	if (unlikely(!access_ok(ptr,len)))
> >  		return 0;
> > +
> > +	/*
> > +	 * Enable tag checking for the user access if MTE is enabled
> > +	 * in the userspace task.
> > +	 *
> > +	 * Note: We don't need to do anything if KASAN is enabled,
> > +	 * since that means the tag checking override (TCO) will
> > +	 * already be disabled. In turn, the TCF0 bits will control
> > +	 * whether user-space tag checking happens.
> > +	 */
> > +	if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
> > +		asm volatile(SET_PSTATE_TCO(0));
> > +
> >  	uaccess_ttbr0_enable();
> >  	return 1;
> >  }
> 
> What about all the uaccess routines that don't call user_access_begin? For
> example, copy_from_user().

We might as well ignore tag checking for all uaccess for specific
hardware. It's a relaxation but you get this with futex already and some
combination of read/write() syscalls with O_DIRECT.


Reading the Arm ARM section again, I wonder whether always setting TCMA1
does the trick for the Ampere hardware. With KASAN disabled in the
kernel, all addresses will start with 0xff... so behave as match-all. We
do this with KASAN_HW_TAGS enabled but it won't have any effect with
kasan disabled.
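
To illustrate the match-all argument, here is a minimal userspace sketch in C. It assumes the architectural rule that TCMA1 makes an access Unchecked when address bits [59:55] are all ones (and TCMA0 when they are all zeros); the helper names and the sample addresses are hypothetical and only model the address arithmetic, not any hardware behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Logical address tag: bits [59:56] of a 64-bit virtual address. */
static unsigned int addr_logical_tag(uint64_t addr)
{
	return (unsigned int)((addr >> 56) & 0xf);
}

/* Bit 55 selects the translation regime: 1 = TTBR1 (kernel), 0 = TTBR0 (user). */
static bool addr_is_ttbr1(uint64_t addr)
{
	return ((addr >> 55) & 1) != 0;
}

/*
 * Model of the TCMA rule: returns true when the TCMA0/TCMA1 setting
 * would make an access to this address Tag Unchecked.
 */
static bool tcma_makes_unchecked(uint64_t addr, bool tcma0, bool tcma1)
{
	unsigned int bits_59_55 = (unsigned int)((addr >> 55) & 0x1f);

	if (bits_59_55 == 0x1f)	/* TTBR1 address with tag 0xF */
		return tcma1;
	if (bits_59_55 == 0x00)	/* TTBR0 address with tag 0x0 */
		return tcma0;
	return false;
}
```

With KASAN disabled, a kernel pointer such as 0xffff800010000000 carries tag 0xF, so the model reports it as Unchecked once TCMA1 is set, while a KASAN-tagged pointer such as 0xf3ff800010000000 or a tagged user pointer is unaffected.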

Carl, could you please try the patch below?

----------------8<----------------------------------------
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 01e868116448..8b1f0de00fd3 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -48,14 +48,14 @@
 #define TCR_KASAN_SW_FLAGS 0
 #endif
 
-#ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
-#elif defined(CONFIG_ARM64_MTE)
+#ifdef CONFIG_ARM64_MTE
 /*
  * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
- * TBI being enabled at EL1.
+ * TBI being enabled at EL1. TCMA1 is needed to treat accesses with the
+ * match-all tag (0xF) as Tag Unchecked, irrespective of the SCTLR_EL1.TCF
+ * setting.
  */
-#define TCR_MTE_FLAGS TCR_EL1_TBI1 | TCR_EL1_TBID1
+#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
 #else
 #define TCR_MTE_FLAGS 0
 #endif
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Taehyun Noh 1 month ago
Hi,

On Thu Jan 8, 2026 at 12:45 PM CST, Catalin Marinas wrote:
> Reading the Arm ARM section again, I wonder whether always setting TCMA1
> does the trick for the Ampere hardware. With KASAN disabled in the
> kernel, all addresses will start with 0xff... so behave as match-all. We
> do this with KASAN_HW_TAGS enabled but it won't have any effect with
> kasan disabled.

Our team agrees with Catalin’s TCMA1 solution. It disables all kernel
tag checking, but user addresses will still get tag checked as long as
TCO is clear. Also, Carl’s initial testing confirms that
`mem_access_checked*:k` counters drop with the TCMA1 patch. While we
haven’t run the memcached benchmark yet, we will follow up with those
results shortly.

Additionally, we’ve observed that Pixel 9 behaves differently; the
kernel does not perform any tag checking when the user process enables
MTE. I’ve tested a simple kernel module that accesses kernel memory on
user ioctl, and measured the MTE perf counters on both AmpereOne and
Pixel 9. Pixel 9 shows no increases in checked access counters, but
AmpereOne shows proportional increases depending on the buffer size that
is accessed inside the kernel module.
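
For readers unfamiliar with how a user process enables MTE, the following is a minimal sketch based on the prctl() interface described in the kernel's MTE documentation. It is not the test program used for the measurements above; the function names are hypothetical, and the fallback constant definitions are only there for older userspace headers.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Fallback definitions in case the installed headers predate MTE support. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC (1UL << 1)
#endif
#ifndef PR_MTE_TAG_SHIFT
#define PR_MTE_TAG_SHIFT 3
#endif

/* Tagged address ABI + synchronous tag check faults + tags 1..15 allowed for IRG. */
static unsigned long mte_sync_ctrl_flags(void)
{
	return PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
	       (0xfffeUL << PR_MTE_TAG_SHIFT);
}

/* Returns 0 on success, -1 if the kernel or CPU rejects the request. */
static int enable_mte_sync(void)
{
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, mte_sync_ctrl_flags(), 0, 0, 0) != 0) {
		perror("prctl(PR_SET_TAGGED_ADDR_CTRL)");
		return -1;
	}
	return 0;
}
```

On a system without MTE the call simply fails and the sketch returns -1. Tag checks additionally apply only to memory mapped with PROT_MTE, which this sketch does not set up.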

We will keep you posted as more data becomes available.

Taehyun
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Catalin Marinas 1 month ago
On Fri, Jan 09, 2026 at 11:29:29PM -0600, Taehyun Noh wrote:
> On Thu Jan 8, 2026 at 12:45 PM CST, Catalin Marinas wrote:
> > Reading the Arm ARM section again, I wonder whether always setting TCMA1
> > does the trick for the Ampere hardware. With KASAN disabled in the
> > kernel, all addresses will start with 0xff... so behave as match-all. We
> > do this with KASAN_HW_TAGS enabled but it won't have any effect with
> > kasan disabled.
> 
> Our team agrees with Catalin’s TCMA1 solution. It disables every kernel
> tag checking but the user address will get tag checked as far as TCO is
> clear. Also, Carl’s initial testing confirms that
> `mem_access_checked*:k` counters drop with the TCMA1 patch. While we
> haven’t run the memcached benchmark yet, we will follow up with those
> results shortly.

That's great. Carl, could you please respin the patch with just setting
the TCMA1 bit? Just add a Suggested-by tag for me (I could post the patch
as well, but I don't have the data to back it up and include in the commit
log).

> Additionally, we’ve observed that Pixel 9 behaves differently; the
> kernel does not perform any tag checking when the user process enables
> MTE. I’ve tested a simple kernel module that accesses kernel memory on
> user ioctl, and measured the MTE perf counters on both AmpereOne and
> Pixel 9. Pixel 9 shows no increases in checked access counters, but
> AmpereOne shows proportional increases depending on the buffer size that
> is accessed inside the kernel module.

It's an implementation choice. I think the Arm Ltd CPUs ignore tag
checking if SCTLR_EL1.TCF==0, irrespective of TCMA1 or TCO. But always
setting TCMA1 is completely harmless and it's covered by the text in the
Arm ARM.

-- 
Catalin
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Carl Worth 3 weeks, 5 days ago
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Fri, Jan 09, 2026 at 11:29:29PM -0600, Taehyun Noh wrote:
>> Our team agrees with Catalin’s TCMA1 solution. It disables every kernel
>> tag checking but the user address will get tag checked as far as TCO is
>> clear. Also, Carl’s initial testing confirms that
>> `mem_access_checked*:k` counters drop with the TCMA1 patch. While we
>> haven’t run the memcached benchmark yet, we will follow up with those
>> results shortly.
>
> That's great. Carl, could you please respin the patch with just setting
> the TCMA1 bit? Just add a suggested-by me (I could post the patch as
> well but I don't have the data to back it up and include in the commit
> log).

Will do. I'm just running the final benchmark numbers and then will send
this out.

-Carl
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Will Deacon 1 month ago
On Thu, Jan 08, 2026 at 06:45:30PM +0000, Catalin Marinas wrote:
> On Thu, Jan 08, 2026 at 03:06:30PM +0000, Will Deacon wrote:
> > On Thu, Oct 30, 2025 at 08:49:32PM -0700, Carl Worth wrote:
> > > The PSTATE.TCO (Tag Checking Override) register, when set causes MTE
> > > tag checking to be disabled. The TCO bit is automatically set by the
> > > hardware when an exception is taken.
> > > 
> > > Prior to this commit, mte_disable_tco_entry would clear TCO (enable
> > > tag checking) for either of two cases: 1. When the kernel wants tag
> > > checking (KASAN) or 2. when userspace wants tag checking (via
> > > SCTLR.TCF0).
> > > 
> > > In the case of userspace desired tag checking, (that is, when KASAN is
> > > off), clearing TCO on entry to the kernel has negative performance
> > > implications. This results in excess kernel space tag checking that
> > > has not been requested.
> 
> I would have expected the hardware to avoid any tag checking if
> SCTLR_EL1.TCF is 0. I guess the Arm ARM isn't entirely clear (D10.4.1
> Tag Checked memory accesses), it seems to only mention TCF and TCMA with
> a match-all tag for considering Unchecked accesses.
> 
> > > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > > index 1aa4ecb73429..248741a66c91 100644
> > > --- a/arch/arm64/include/asm/uaccess.h
> > > +++ b/arch/arm64/include/asm/uaccess.h
> > > @@ -417,11 +417,41 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
> > >  {
> > >  	if (unlikely(!access_ok(ptr,len)))
> > >  		return 0;
> > > +
> > > +	/*
> > > +	 * Enable tag checking for the user access if MTE is enabled
> > > +	 * in the userspace task.
> > > +	 *
> > > +	 * Note: We don't need to do anything if KASAN is enabled,
> > > +	 * since that means the tag checking override (TCO) will
> > > +	 * already be disabled. In turn, the TCF0 bits will control
> > > +	 * whether user-space tag checking happens.
> > > +	 */
> > > +	if (!kasan_hw_tags_enabled() && user_uses_tagcheck())
> > > +		asm volatile(SET_PSTATE_TCO(0));
> > > +
> > >  	uaccess_ttbr0_enable();
> > >  	return 1;
> > >  }
> > 
> > What about all the uaccess routines that don't call user_access_begin? For
> > example, copy_from_user().
> 
> We might as well ignore tag checking for all uaccess for specific
> hardware. It's a relaxation but you get this with futex already and some
> combination of read/write() syscalls with O_DIRECT.

Hmm, you could argue it's an ABI break, no? You can write a userspace
program that will behave differently before and after the change.
Conversely, you could argue that a syscall using uaccess is an
unstable implementation detail of the syscall, but it feels a bit fragile
(for example, signal delivery is always going to use the uaccess routines
to access the signal stack).

Will
Re: [PATCH 2/2] arm64: mte: Defer disabling of TCO until user_access_begin/end
Posted by Carl Worth 1 month ago
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Thu, Jan 08, 2026 at 03:06:30PM +0000, Will Deacon wrote:
>> 
>> What about all the uaccess routines that don't call user_access_begin? For
>> example, copy_from_user().

It's possible I missed some code paths here. Thanks for pointing that
out.

> We might as well ignore tag checking for all uaccess for specific
> hardware. It's a relaxation but you get this with futex already and some
> combination of read/write() syscalls with O_DIRECT.

I'm not sure I agree with that. I mean, you could argue that since the
current implementation doesn't guarantee all uaccess gets tag checking
we have cover for skipping tag checking in other cases.

But I think the system is strictly better if we prefer to have kernel
uaccess use tag checking wherever possible.

> Reading the Arm ARM section again, I wonder whether always setting TCMA1
> does the trick for the Ampere hardware. With KASAN disabled in the
> kernel, all addresses will start with 0xff... so behave as match-all. We
> do this with KASAN_HW_TAGS enabled but it won't have any effect with
> kasan disabled.

I'm not familiar with any "match-all" semantics associated with a
tag-value of 0xf. Maybe I'm missing something?

But I'm clearly not aware of everything regarding MTE, since TCMA1 was
new to me too.

Having read up on it now, I agree it looks like a good approach to try
for addressing the performance problem here. And this would let us leave
the TCO handling as-is, so we could skip past Will's two concerns above
(potential performance slowdown for use cases other than what I've
reported on, and potential code paths where I missed the toggling of
TCO).

> Carl, could you please try the patch below?

I'll do that and report back here soon.

Thanks,

-Carl

> ----------------8<----------------------------------------
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 01e868116448..8b1f0de00fd3 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -48,14 +48,14 @@
>  #define TCR_KASAN_SW_FLAGS 0
>  #endif
>  
> -#ifdef CONFIG_KASAN_HW_TAGS
> -#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
> -#elif defined(CONFIG_ARM64_MTE)
> +#ifdef CONFIG_ARM64_MTE
>  /*
>   * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
> - * TBI being enabled at EL1.
> + * TBI being enabled at EL1. TCMA1 is needed to treat accesses with the
> + * match-all tag (0xF) as Tag Unchecked, irrespective of the SCTLR_EL1.TCF
> + * setting.
>   */
> -#define TCR_MTE_FLAGS TCR_EL1_TBI1 | TCR_EL1_TBID1
> +#define TCR_MTE_FLAGS TCR_EL1_TCMA1 | TCR_EL1_TBI1 | TCR_EL1_TBID1
>  #else
>  #define TCR_MTE_FLAGS 0
>  #endif