[PATCH 2/3] fork: skip MTE tagging for kernel stacks

Muhammad Usama Anjum posted 3 patches 2 weeks, 3 days ago
There is a newer version of this series
Posted by Muhammad Usama Anjum 2 weeks, 3 days ago
The stack pointer always uses the match-all tag, so MTE never checks
tags on stack accesses. Tagging stack memory on every thread creation
is pure overhead.

- Pass __GFP_SKIP_KASAN in gfp_mask for vmalloc-backed stacks so the
  vmalloc path skips HW tag setup (see previous patch).
- For the cached VMAP reuse path, skip kasan_unpoison_range() when HW
  tags are enabled since the memory will only be accessed through the
  match-all tagged SP.
- For the normal page allocator path, pass __GFP_SKIP_KASAN directly
  to the page allocator.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
---
 kernel/fork.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index bb0c2613a5604..2baf4db39b5a4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -345,7 +345,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 		}
 
 		/* Reset stack metadata. */
-		kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
+		if (!kasan_hw_tags_enabled())
+			kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
 
 		stack = kasan_reset_tag(vm_area->addr);
 
@@ -358,7 +359,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 	}
 
 	stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN,
-				     GFP_VMAP_STACK,
+				     GFP_VMAP_STACK | __GFP_SKIP_KASAN,
 				     node, __builtin_return_address(0));
 	if (!stack)
 		return -ENOMEM;
@@ -410,7 +411,8 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
 
 static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
-	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
+	struct page *page = alloc_pages_node(node,
+					     THREADINFO_GFP | __GFP_SKIP_KASAN,
 					     THREAD_SIZE_ORDER);
 
 	if (likely(page)) {
-- 
2.47.3
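
The commit message rests on one claim: an access made through a pointer carrying the match-all tag is never tag-checked, so the allocation tags on stack memory are irrelevant. A minimal userspace sketch of that rule follows. It is a model, not kernel code. It assumes the logical tag sits in pointer bits 59:56 and that the match-all value is 0xF. The helpers `ptr_tag()`, `model_reset_tag()` and `model_access_faults()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT     56      /* assumed: logical tag in pointer bits 59:56 */
#define TAG_MASK      0xFULL
#define MATCH_ALL_TAG 0xFULL  /* assumed match-all value for kernel pointers */

/* Extract the 4-bit logical tag from a pointer value. */
static unsigned int ptr_tag(uint64_t ptr)
{
	return (unsigned int)((ptr >> TAG_SHIFT) & TAG_MASK);
}

/* Hypothetical stand-in for kasan_reset_tag(): force the match-all tag. */
static uint64_t model_reset_tag(uint64_t ptr)
{
	return ptr | (MATCH_ALL_TAG << TAG_SHIFT);
}

/* Return true if the modelled access would raise a tag check fault. */
static bool model_access_faults(uint64_t ptr, unsigned int mem_tag)
{
	assert(mem_tag <= TAG_MASK);

	/* Match-all pointers are unchecked, whatever the memory tag is. */
	if (ptr_tag(ptr) == MATCH_ALL_TAG)
		return false;

	return ptr_tag(ptr) != mem_tag;
}
```

In this model, once a stack address has gone through `model_reset_tag()`, `model_access_faults()` returns false for every possible memory tag, which is why writing tags to the stack on each thread creation buys nothing.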
Re: [PATCH 2/3] fork: skip MTE tagging for kernel stacks
Posted by Ryan Roberts 2 weeks, 3 days ago
On 19/03/2026 11:49, Muhammad Usama Anjum wrote:
> The stack pointer always uses the match-all tag, so MTE never checks
> tags on stack accesses. Tagging stack memory on every thread creation
> is pure overhead.
> 
> - Pass __GFP_SKIP_KASAN in gfp_mask for vmalloc-backed stacks so the
>   vmalloc path skips HW tag setup (see previous patch).
> - For the cached VMAP reuse path, skip kasan_unpoison_range() when HW
>   tags are enabled since the memory will only be accessed through the
>   match-all tagged SP.
> - For the normal page allocator path, pass __GFP_SKIP_KASAN directly
>   to the page allocator.
> 
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
> ---
>  kernel/fork.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index bb0c2613a5604..2baf4db39b5a4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -345,7 +345,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>  		}
>  
>  		/* Reset stack metadata. */
> -		kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
> +		if (!kasan_hw_tags_enabled())
> +			kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
>  
>  		stack = kasan_reset_tag(vm_area->addr);
>  
> @@ -358,7 +359,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>  	}
>  
>  	stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN,
> -				     GFP_VMAP_STACK,
> +				     GFP_VMAP_STACK | __GFP_SKIP_KASAN,

Perhaps cleaner to include __GFP_SKIP_KASAN in GFP_VMAP_STACK ?

>  				     node, __builtin_return_address(0));
>  	if (!stack)
>  		return -ENOMEM;
> @@ -410,7 +411,8 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
>  
>  static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>  {
> -	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
> +	struct page *page = alloc_pages_node(node,
> +					     THREADINFO_GFP | __GFP_SKIP_KASAN,

I think there are some other places that could benefit from __GFP_SKIP_KASAN;
see arm64's arch_alloc_vmap_stack(), which allocates stacks for efi, irq and
sdei. I think these are allocated at boot, so not really performance sensitive,
but we might as well be consistent?

You've also missed the alloc_thread_stack_node() implementation for !VMAP when
PAGE_SIZE > THREAD_SIZE.

All of these sites use THREADINFO_GFP so perhaps it is better to just define
THREADINFO_GFP to include __GFP_SKIP_KASAN ?

Thanks,
Ryan


>  					     THREAD_SIZE_ORDER);
>  
>  	if (likely(page)) {
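
The review suggestion above, folding __GFP_SKIP_KASAN into the shared masks instead of OR-ing it in at each call site, can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel's definitions: every `MOCK_` name and bit value is hypothetical, and the composition of the two mock masks is only a guess at the shape such a change might take.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Hypothetical bit values; the real flags live in the kernel's gfp headers. */
#define MOCK_GFP_KERNEL     0x01u
#define MOCK_GFP_ACCOUNT    0x02u
#define MOCK_GFP_ZERO       0x04u
#define MOCK_GFP_SKIP_KASAN 0x08u

/* The skip flag is part of each mask, so no call site can forget it. */
#define MOCK_THREADINFO_GFP \
	(MOCK_GFP_KERNEL | MOCK_GFP_ACCOUNT | MOCK_GFP_ZERO | MOCK_GFP_SKIP_KASAN)
#define MOCK_GFP_VMAP_STACK \
	(MOCK_GFP_KERNEL | MOCK_GFP_ZERO | MOCK_GFP_SKIP_KASAN)

/* Pick the mask a stack allocation site would pass to its allocator. */
static gfp_t stack_gfp(bool vmap)
{
	gfp_t gfp = vmap ? MOCK_GFP_VMAP_STACK : MOCK_THREADINFO_GFP;

	/* Every stack allocation inherits the skip flag from its mask. */
	assert(gfp & MOCK_GFP_SKIP_KASAN);
	return gfp;
}
```

The design point is that a new stack allocation site picks up the behaviour by using the existing mask, rather than having to remember an extra flag.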
Re: [PATCH 2/3] fork: skip MTE tagging for kernel stacks
Posted by Muhammad Usama Anjum 2 weeks, 3 days ago
Hi Ryan,

Thank you for the review.

On 19/03/2026 12:09 pm, Ryan Roberts wrote:
> On 19/03/2026 11:49, Muhammad Usama Anjum wrote:
>> The stack pointer always uses the match-all tag, so MTE never checks
>> tags on stack accesses. Tagging stack memory on every thread creation
>> is pure overhead.
>>
>> - Pass __GFP_SKIP_KASAN in gfp_mask for vmalloc-backed stacks so the
>>   vmalloc path skips HW tag setup (see previous patch).
>> - For the cached VMAP reuse path, skip kasan_unpoison_range() when HW
>>   tags are enabled since the memory will only be accessed through the
>>   match-all tagged SP.
>> - For the normal page allocator path, pass __GFP_SKIP_KASAN directly
>>   to the page allocator.
>>
>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
>> ---
>>  kernel/fork.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index bb0c2613a5604..2baf4db39b5a4 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -345,7 +345,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>>  		}
>>  
>>  		/* Reset stack metadata. */
>> -		kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
>> +		if (!kasan_hw_tags_enabled())
>> +			kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
>>  
>>  		stack = kasan_reset_tag(vm_area->addr);
>>  
>> @@ -358,7 +359,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>>  	}
>>  
>>  	stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN,
>> -				     GFP_VMAP_STACK,
>> +				     GFP_VMAP_STACK | __GFP_SKIP_KASAN,
> 
> Perhaps cleaner to include __GFP_SKIP_KASAN in GFP_VMAP_STACK ?
Yes, that would be cleaner and is the right approach. I'll do that in the next version.

> 
>>  				     node, __builtin_return_address(0));
>>  	if (!stack)
>>  		return -ENOMEM;
>> @@ -410,7 +411,8 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
>>  
>>  static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>>  {
>> -	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
>> +	struct page *page = alloc_pages_node(node,
>> +					     THREADINFO_GFP | __GFP_SKIP_KASAN,
> 
> I think there are some other places that could benefit from __GFP_SKIP_KASAN;
> see arm64's arch_alloc_vmap_stack(), which allocates stacks for efi, irq and
> sdei. I think these are allocated at boot, so not really performance sensitive,
> but we might as well be consistent?
> 
> You've also missed the alloc_thread_stack_node() implementation for !VMAP when
> PAGE_SIZE > THREAD_SIZE.
> 
> All of these sites use THREADINFO_GFP so perhaps it is better to just define
> THREADINFO_GFP to include __GFP_SKIP_KASAN ?
Yes, that is the straightforward and clean approach. I'll update the patch.

> 
> Thanks,
> Ryan
> 
> 
>>  					     THREAD_SIZE_ORDER);
>>  
>>  	if (likely(page)) {
>