The stack pointer always uses the match-all tag, so MTE never checks
tags on stack accesses. Tagging stack memory on every thread creation
is pure overhead.

- Pass __GFP_SKIP_KASAN in gfp_mask for vmalloc-backed stacks so the
  vmalloc path skips HW tag setup (see previous patch).
- For the cached VMAP reuse path, skip kasan_unpoison_range() when HW
  tags are enabled since the memory will only be accessed through the
  match-all tagged SP.
- For the normal page allocator path, pass __GFP_SKIP_KASAN directly
  to the page allocator.

Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
---
kernel/fork.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index bb0c2613a5604..2baf4db39b5a4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -345,7 +345,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
}
/* Reset stack metadata. */
- kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
+ if (!kasan_hw_tags_enabled())
+ kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
stack = kasan_reset_tag(vm_area->addr);
@@ -358,7 +359,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
}
stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN,
- GFP_VMAP_STACK,
+ GFP_VMAP_STACK | __GFP_SKIP_KASAN,
node, __builtin_return_address(0));
if (!stack)
return -ENOMEM;
@@ -410,7 +411,8 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
static int alloc_thread_stack_node(struct task_struct *tsk, int node)
{
- struct page *page = alloc_pages_node(node, THREADINFO_GFP,
+ struct page *page = alloc_pages_node(node,
+ THREADINFO_GFP | __GFP_SKIP_KASAN,
THREAD_SIZE_ORDER);
if (likely(page)) {
--
2.47.3
On 19/03/2026 11:49, Muhammad Usama Anjum wrote:
> The stack pointer always uses the match-all tag, so MTE never checks
> tags on stack accesses. Tagging stack memory on every thread creation
> is pure overhead.
>
> - Pass __GFP_SKIP_KASAN in gfp_mask for vmalloc-backed stacks so the
> vmalloc path skips HW tag setup (see previous patch).
> - For the cached VMAP reuse path, skip kasan_unpoison_range() when HW
> tags are enabled since the memory will only be accessed through the
> match-all tagged SP.
> - For the normal page allocator path, pass __GFP_SKIP_KASAN directly
> to the page allocator.
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
> ---
> kernel/fork.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index bb0c2613a5604..2baf4db39b5a4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -345,7 +345,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
> }
>
> /* Reset stack metadata. */
> - kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
> + if (!kasan_hw_tags_enabled())
> + kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
>
> stack = kasan_reset_tag(vm_area->addr);
>
> @@ -358,7 +359,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
> }
>
> stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN,
> - GFP_VMAP_STACK,
> + GFP_VMAP_STACK | __GFP_SKIP_KASAN,

Perhaps cleaner to include __GFP_SKIP_KASAN in GFP_VMAP_STACK?

> node, __builtin_return_address(0));
> if (!stack)
> return -ENOMEM;
> @@ -410,7 +411,8 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
>
> static int alloc_thread_stack_node(struct task_struct *tsk, int node)
> {
> - struct page *page = alloc_pages_node(node, THREADINFO_GFP,
> + struct page *page = alloc_pages_node(node,
> + THREADINFO_GFP | __GFP_SKIP_KASAN,

I think there are some other places that could benefit from __GFP_SKIP_KASAN;
see arm64's arch_alloc_vmap_stack(), which allocates stacks for efi, irq and
sdei. I think these are allocated at boot, so not really performance sensitive,
but we might as well be consistent?

You've also missed the alloc_thread_stack_node() implementation for !VMAP when
PAGE_SIZE > STACK_SIZE.

All of these sites use THREADINFO_GFP so perhaps it is better to just define
THREADINFO_GFP to include __GFP_SKIP_KASAN?

Thanks,
Ryan

> THREAD_SIZE_ORDER);
>
> if (likely(page)) {

Hi Ryan,

Thank you for the review.

On 19/03/2026 12:09 pm, Ryan Roberts wrote:
> On 19/03/2026 11:49, Muhammad Usama Anjum wrote:
>> The stack pointer always uses the match-all tag, so MTE never checks
>> tags on stack accesses. Tagging stack memory on every thread creation
>> is pure overhead.
>>
>> - Pass __GFP_SKIP_KASAN in gfp_mask for vmalloc-backed stacks so the
>> vmalloc path skips HW tag setup (see previous patch).
>> - For the cached VMAP reuse path, skip kasan_unpoison_range() when HW
>> tags are enabled since the memory will only be accessed through the
>> match-all tagged SP.
>> - For the normal page allocator path, pass __GFP_SKIP_KASAN directly
>> to the page allocator.
>>
>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
>> ---
>> kernel/fork.c | 8 +++++---
>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index bb0c2613a5604..2baf4db39b5a4 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -345,7 +345,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>> }
>>
>> /* Reset stack metadata. */
>> - kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
>> + if (!kasan_hw_tags_enabled())
>> + kasan_unpoison_range(vm_area->addr, THREAD_SIZE);
>>
>> stack = kasan_reset_tag(vm_area->addr);
>>
>> @@ -358,7 +359,7 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>> }
>>
>> stack = __vmalloc_node(THREAD_SIZE, THREAD_ALIGN,
>> - GFP_VMAP_STACK,
>> + GFP_VMAP_STACK | __GFP_SKIP_KASAN,
>
> Perhaps cleaner to include __GFP_SKIP_KASAN in GFP_VMAP_STACK ?

Yes, that would be cleaner and is the correct approach. I'll add it in the
next version.

>
>> node, __builtin_return_address(0));
>> if (!stack)
>> return -ENOMEM;
>> @@ -410,7 +411,8 @@ static void thread_stack_delayed_free(struct task_struct *tsk)
>>
>> static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>> {
>> - struct page *page = alloc_pages_node(node, THREADINFO_GFP,
>> + struct page *page = alloc_pages_node(node,
>> + THREADINFO_GFP | __GFP_SKIP_KASAN,
>
> I think there are some other places that could benefit from __GFP_SKIP_KASAN;
> see arm64's arch_alloc_vmap_stack(), which allocates stacks for efi, irq and
> sdei. I think these are allocated at boot, so not really performance sensitive,
> but we might as well be consistent?
>
> You've also missed the alloc_thread_stack_node() implementation for !VMAP when
> PAGE_SIZE > STACK_SIZE.
>
> All of these sites use THREADINFO_GFP so perhaps it is better to just define
> THREADINFO_GFP to include __GFP_SKIP_KASAN ?

Yes, that is the straightforward and clean approach. I'll update.

>
> Thanks,
> Ryan
>
>
>> THREAD_SIZE_ORDER);
>>
>> if (likely(page)) {
>
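
For readers following the thread, the direction agreed above (fold
__GFP_SKIP_KASAN into the composite masks instead of OR-ing it in at each
call site) can be sketched in plain userspace C. This is not the next
version of the patch. Every DEMO_* name, the flag bit values, the way the
composite masks are built, and the demo_should_tag() helper are
hypothetical stand-ins chosen for illustration; the real definitions live
in the kernel headers and the real tagging decision is more involved.

```c
#include <stdbool.h>

/*
 * Stand-in flag bits for illustration only. The real GFP flag values
 * are defined by the kernel and differ from these.
 */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_KERNEL		0x01u
#define DEMO_GFP_ZERO		0x02u
#define DEMO_GFP_SKIP_KASAN	0x04u

/*
 * Hypothetical composite masks with the skip flag folded in, so every
 * caller that uses them inherits it without touching the call site.
 */
#define DEMO_GFP_VMAP_STACK \
	(DEMO_GFP_KERNEL | DEMO_GFP_ZERO | DEMO_GFP_SKIP_KASAN)
#define DEMO_THREADINFO_GFP \
	(DEMO_GFP_KERNEL | DEMO_GFP_ZERO | DEMO_GFP_SKIP_KASAN)

/*
 * Simplified model of the allocator-side decision: tag memory only when
 * HW tags are enabled and the caller did not ask to skip KASAN.
 */
static bool demo_should_tag(demo_gfp_t gfp, bool hw_tags_enabled)
{
	return hw_tags_enabled && !(gfp & DEMO_GFP_SKIP_KASAN);
}
```

With the flag carried by the mask itself, a stack allocation site added
later that passes DEMO_THREADINFO_GFP or DEMO_GFP_VMAP_STACK skips tagging
automatically, which is the consistency argument made in the review.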