[PATCH v4 07/18] mm: x86: Untag addresses in EXECMEM_ROX related pointer arithmetic

Maciej Wieczor-Retman posted 18 patches 1 month, 3 weeks ago
There is a newer version of this series
[PATCH v4 07/18] mm: x86: Untag addresses in EXECMEM_ROX related pointer arithmetic
Posted by Maciej Wieczor-Retman 1 month, 3 weeks ago
ARCH_HAS_EXECMEM_ROX was re-enabled in x86 at Linux 6.14 release.
Related code has multiple spots where page virtual addresses end up used
as arguments in arithmetic operations. Combined with enabled tag-based
KASAN it can result in pointers that don't point where they should or
logical operations not giving expected results.

vm_reset_perms() calculates range's start and end addresses using min()
and max() functions. To do that it compares pointers but some are not
tagged - addr variable is, start and end variables aren't.

within() and within_range() can receive tagged addresses which get
compared to untagged start and end variables.

Reset tags in addresses used as function arguments in min(), max(),
within() and within_range().

execmem_cache_add() adds tagged pointers to a maple tree structure,
which then are incorrectly compared when walking the tree. That results
in different pointers being returned later and page permission violation
errors panicking the kernel.

Reset tag of the address range inserted into the maple tree inside
execmem_cache_add().

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
Changelog v4:
- Add patch to the series.

 arch/x86/mm/pat/set_memory.c | 1 +
 mm/execmem.c                 | 4 +++-
 mm/vmalloc.c                 | 4 ++--
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 8834c76f91c9..1f14a1297db0 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -222,6 +222,7 @@ static inline void cpa_inc_lp_preserved(int level) { }
 static inline int
 within(unsigned long addr, unsigned long start, unsigned long end)
 {
+	addr = (unsigned long)kasan_reset_tag((void *)addr);
 	return addr >= start && addr < end;
 }
 
diff --git a/mm/execmem.c b/mm/execmem.c
index 0822305413ec..743fa4a8c069 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -191,6 +191,8 @@ static int execmem_cache_add_locked(void *ptr, size_t size, gfp_t gfp_mask)
 	unsigned long lower, upper;
 	void *area = NULL;
 
+	addr = arch_kasan_reset_tag(addr);
+
 	lower = addr;
 	upper = addr + size - 1;
 
@@ -216,7 +218,7 @@ static int execmem_cache_add(void *ptr, size_t size, gfp_t gfp_mask)
 static bool within_range(struct execmem_range *range, struct ma_state *mas,
 			 size_t size)
 {
-	unsigned long addr = mas->index;
+	unsigned long addr = arch_kasan_reset_tag(mas->index);
 
 	if (addr >= range->start && addr + size < range->end)
 		return true;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6dbcdceecae1..83d666e4837a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3328,8 +3328,8 @@ static void vm_reset_perms(struct vm_struct *area)
 			unsigned long page_size;
 
 			page_size = PAGE_SIZE << page_order;
-			start = min(addr, start);
-			end = max(addr + page_size, end);
+			start = min((unsigned long)arch_kasan_reset_tag(addr), start);
+			end = max((unsigned long)arch_kasan_reset_tag(addr) + page_size, end);
 			flush_dmap = 1;
 		}
 	}
-- 
2.50.1
Re: [PATCH v4 07/18] mm: x86: Untag addresses in EXECMEM_ROX related pointer arithmetic
Posted by Mike Rapoport 1 month, 3 weeks ago
On Tue, Aug 12, 2025 at 03:23:43PM +0200, Maciej Wieczor-Retman wrote:
> ARCH_HAS_EXECMEM_ROX was re-enabled in x86 at Linux 6.14 release.
> Related code has multiple spots where page virtual addresses end up used
> as arguments in arithmetic operations. Combined with enabled tag-based
> KASAN it can result in pointers that don't point where they should or
> logical operations not giving expected results.
> 
> vm_reset_perms() calculates range's start and end addresses using min()
> and max() functions. To do that it compares pointers but some are not
> tagged - addr variable is, start and end variables aren't.
> 
> within() and within_range() can receive tagged addresses which get
> compared to untagged start and end variables.
> 
> Reset tags in addresses used as function arguments in min(), max(),
> within() and within_range().
> 
> execmem_cache_add() adds tagged pointers to a maple tree structure,
> which then are incorrectly compared when walking the tree. That results
> in different pointers being returned later and page permission violation
> errors panicking the kernel.
> 
> Reset tag of the address range inserted into the maple tree inside
> execmem_cache_add().
> 
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
> Changelog v4:
> - Add patch to the series.
> 
>  arch/x86/mm/pat/set_memory.c | 1 +
>  mm/execmem.c                 | 4 +++-
>  mm/vmalloc.c                 | 4 ++--
>  3 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 8834c76f91c9..1f14a1297db0 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -222,6 +222,7 @@ static inline void cpa_inc_lp_preserved(int level) { }
>  static inline int
>  within(unsigned long addr, unsigned long start, unsigned long end)
>  {
> +	addr = (unsigned long)kasan_reset_tag((void *)addr);
>  	return addr >= start && addr < end;
>  }
>  
> diff --git a/mm/execmem.c b/mm/execmem.c
> index 0822305413ec..743fa4a8c069 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -191,6 +191,8 @@ static int execmem_cache_add_locked(void *ptr, size_t size, gfp_t gfp_mask)
>  	unsigned long lower, upper;
>  	void *area = NULL;
>  
> +	addr = arch_kasan_reset_tag(addr);

Shouldn't this use kasan_reset_tag()?
And the calls below as well?

Also this can be done when addr is initialized 

> +
>  	lower = addr;
>  	upper = addr + size - 1;
>  
> @@ -216,7 +218,7 @@ static int execmem_cache_add(void *ptr, size_t size, gfp_t gfp_mask)
>  static bool within_range(struct execmem_range *range, struct ma_state *mas,
>  			 size_t size)
>  {
> -	unsigned long addr = mas->index;
> +	unsigned long addr = arch_kasan_reset_tag(mas->index);

AFAIU, we use plain address without the tag as an index in
execmem_cache_add(), so here mas->index will be a plain address as well
  
>  	if (addr >= range->start && addr + size < range->end)
>  		return true;
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6dbcdceecae1..83d666e4837a 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3328,8 +3328,8 @@ static void vm_reset_perms(struct vm_struct *area)
>  			unsigned long page_size;
>  
>  			page_size = PAGE_SIZE << page_order;
> -			start = min(addr, start);
> -			end = max(addr + page_size, end);
> +			start = min((unsigned long)arch_kasan_reset_tag(addr), start);
> +			end = max((unsigned long)arch_kasan_reset_tag(addr) + page_size, end);
>  			flush_dmap = 1;
>  		}
>  	}
> -- 
> 2.50.1
> 

-- 
Sincerely yours,
Mike.
Re: [PATCH v4 07/18] mm: x86: Untag addresses in EXECMEM_ROX related pointer arithmetic
Posted by Maciej Wieczor-Retman 1 month, 2 weeks ago
On 2025-08-14 at 10:26:19 +0300, Mike Rapoport wrote:
>On Tue, Aug 12, 2025 at 03:23:43PM +0200, Maciej Wieczor-Retman wrote:
>> ARCH_HAS_EXECMEM_ROX was re-enabled in x86 at Linux 6.14 release.
>> Related code has multiple spots where page virtual addresses end up used
>> as arguments in arithmetic operations. Combined with enabled tag-based
>> KASAN it can result in pointers that don't point where they should or
>> logical operations not giving expected results.
>> 
>> vm_reset_perms() calculates range's start and end addresses using min()
>> and max() functions. To do that it compares pointers but some are not
>> tagged - addr variable is, start and end variables aren't.
>> 
>> within() and within_range() can receive tagged addresses which get
>> compared to untagged start and end variables.
>> 
>> Reset tags in addresses used as function arguments in min(), max(),
>> within() and within_range().
>> 
>> execmem_cache_add() adds tagged pointers to a maple tree structure,
>> which then are incorrectly compared when walking the tree. That results
>> in different pointers being returned later and page permission violation
>> errors panicking the kernel.
>> 
>> Reset tag of the address range inserted into the maple tree inside
>> execmem_cache_add().
>> 
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>> ---
>> Changelog v4:
>> - Add patch to the series.
>> 
>>  arch/x86/mm/pat/set_memory.c | 1 +
>>  mm/execmem.c                 | 4 +++-
>>  mm/vmalloc.c                 | 4 ++--
>>  3 files changed, 6 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
>> index 8834c76f91c9..1f14a1297db0 100644
>> --- a/arch/x86/mm/pat/set_memory.c
>> +++ b/arch/x86/mm/pat/set_memory.c
>> @@ -222,6 +222,7 @@ static inline void cpa_inc_lp_preserved(int level) { }
>>  static inline int
>>  within(unsigned long addr, unsigned long start, unsigned long end)
>>  {
>> +	addr = (unsigned long)kasan_reset_tag((void *)addr);
>>  	return addr >= start && addr < end;
>>  }
>>  
>> diff --git a/mm/execmem.c b/mm/execmem.c
>> index 0822305413ec..743fa4a8c069 100644
>> --- a/mm/execmem.c
>> +++ b/mm/execmem.c
>> @@ -191,6 +191,8 @@ static int execmem_cache_add_locked(void *ptr, size_t size, gfp_t gfp_mask)
>>  	unsigned long lower, upper;
>>  	void *area = NULL;
>>  
>> +	addr = arch_kasan_reset_tag(addr);
>
>Shouldn't this use kasan_reset_tag()?
>And the calls below as well?

Yes, my mistake, the kernel bot pointed that out for me too :b.

>
>Also this can be done when addr is initialized 

Sure, I'll do that there.

>
>> +
>>  	lower = addr;
>>  	upper = addr + size - 1;
>>  
>> @@ -216,7 +218,7 @@ static int execmem_cache_add(void *ptr, size_t size, gfp_t gfp_mask)
>>  static bool within_range(struct execmem_range *range, struct ma_state *mas,
>>  			 size_t size)
>>  {
>> -	unsigned long addr = mas->index;
>> +	unsigned long addr = arch_kasan_reset_tag(mas->index);
>
>AFAIU, we use plain address without the tag as an index in
>execmem_cache_add(), so here mas->index will be a plain address as well

I'll recheck to make sure but I had some unspecific errors such as "page
permission violation". So I thought a page address must be picked incorrectly
somewhere due to tagging. After revising most places where there is pointer
arithmetic / comparisons and printing these addresses I found some were tagged
in within_range().

But I'll recheck if my other changes didn't make this line redundant. I added
this first which fixed some issues but then I found more which were fixed by
resetting addr in execmem_cache_add_locked().

>  
>>  	if (addr >= range->start && addr + size < range->end)
>>  		return true;
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 6dbcdceecae1..83d666e4837a 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -3328,8 +3328,8 @@ static void vm_reset_perms(struct vm_struct *area)
>>  			unsigned long page_size;
>>  
>>  			page_size = PAGE_SIZE << page_order;
>> -			start = min(addr, start);
>> -			end = max(addr + page_size, end);
>> +			start = min((unsigned long)arch_kasan_reset_tag(addr), start);
>> +			end = max((unsigned long)arch_kasan_reset_tag(addr) + page_size, end);
>>  			flush_dmap = 1;
>>  		}
>>  	}
>> -- 
>> 2.50.1
>> 
>
>-- 
>Sincerely yours,
>Mike.

-- 
Kind regards
Maciej Wieczór-Retman