[PATCH v1 03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level

Ryan Roberts posted 16 patches 10 months, 1 week ago
Posted by Ryan Roberts 10 months, 1 week ago
commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
spuriously try to invalidate at level 0 on LPA2-enabled systems.

Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
CONT_PTE_SIZE, which should provide a minor optimization.

Cc: <stable@vger.kernel.org>
Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 03db9cb21ace..8ab9542d2d22 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
 {
 	unsigned long stride = huge_page_size(hstate_vma(vma));
 
-	if (stride == PMD_SIZE)
-		__flush_tlb_range(vma, start, end, stride, false, 2);
-	else if (stride == PUD_SIZE)
-		__flush_tlb_range(vma, start, end, stride, false, 1);
-	else
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+	switch (stride) {
+	case PUD_SIZE:
+		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+		break;
+	case CONT_PMD_SIZE:
+	case PMD_SIZE:
+		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+		break;
+	case CONT_PTE_SIZE:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+		break;
+	default:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+	}
 }
 
 #endif /* __ASM_HUGETLB_H */
-- 
2.43.0
Re: [PATCH v1 03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level
Posted by Anshuman Khandual 10 months, 1 week ago

On 2/5/25 20:39, Ryan Roberts wrote:
> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
> spuriously try to invalidate at level 0 on LPA2-enabled systems.
> 
> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
> CONT_PTE_SIZE, which should provide a minor optimization.
> 
> Cc: <stable@vger.kernel.org>
> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
> index 03db9cb21ace..8ab9542d2d22 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>  {
>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>  
> -	if (stride == PMD_SIZE)
> -		__flush_tlb_range(vma, start, end, stride, false, 2);
> -	else if (stride == PUD_SIZE)
> -		__flush_tlb_range(vma, start, end, stride, false, 1);
> -	else
> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> +	switch (stride) {
> +	case PUD_SIZE:
> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
> +		break;

Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
checks also be added here for this PUD_SIZE case ?

> +	case CONT_PMD_SIZE:
> +	case PMD_SIZE:
> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
> +		break;
> +	case CONT_PTE_SIZE:
> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
> +		break;
> +	default:
> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
> +	}
>  }
>  
>  #endif /* __ASM_HUGETLB_H */
Re: [PATCH v1 03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level
Posted by Ryan Roberts 10 months, 1 week ago
On 06/02/2025 06:46, Anshuman Khandual wrote:
> 
> 
> On 2/5/25 20:39, Ryan Roberts wrote:
>> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
>> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
>> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
>> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
>> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
>> spuriously try to invalidate at level 0 on LPA2-enabled systems.
>>
>> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
>> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
>> CONT_PTE_SIZE, which should provide a minor optimization.
>>
>> Cc: <stable@vger.kernel.org>
>> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>> index 03db9cb21ace..8ab9542d2d22 100644
>> --- a/arch/arm64/include/asm/hugetlb.h
>> +++ b/arch/arm64/include/asm/hugetlb.h
>> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>>  {
>>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>>  
>> -	if (stride == PMD_SIZE)
>> -		__flush_tlb_range(vma, start, end, stride, false, 2);
>> -	else if (stride == PUD_SIZE)
>> -		__flush_tlb_range(vma, start, end, stride, false, 1);
>> -	else
>> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
>> +	switch (stride) {
>> +	case PUD_SIZE:
>> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
>> +		break;
> 
> Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
> checks also be added here for this PUD_SIZE case ?

Yeah I guess so. TBH, it's never been entirely clear to me what the benefit is.
Is it just to remove (a tiny amount of) dead code when we know we don't support
blocks at that level? Or is there something more fundamental going on that I've
missed?

We seem to be quite inconsistent with the use of pud_sect_supported() in
hugetlbpage.c.

Anyway, I'll add this in; I guess it's preferable to follow the established pattern.

Thanks,
Ryan

> 
>> +	case CONT_PMD_SIZE:
>> +	case PMD_SIZE:
>> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
>> +		break;
>> +	case CONT_PTE_SIZE:
>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
>> +		break;
>> +	default:
>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
>> +	}
>>  }
>>  
>>  #endif /* __ASM_HUGETLB_H */
Re: [PATCH v1 03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level
Posted by Anshuman Khandual 10 months ago

On 2/6/25 18:34, Ryan Roberts wrote:
> On 06/02/2025 06:46, Anshuman Khandual wrote:
>>
>>
>> On 2/5/25 20:39, Ryan Roberts wrote:
>>> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
>>> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
>>> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
>>> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
>>> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
>>> spuriously try to invalidate at level 0 on LPA2-enabled systems.
>>>
>>> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
>>> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
>>> CONT_PTE_SIZE, which should provide a minor optimization.
>>>
>>> Cc: <stable@vger.kernel.org>
>>> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> ---
>>>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>>> index 03db9cb21ace..8ab9542d2d22 100644
>>> --- a/arch/arm64/include/asm/hugetlb.h
>>> +++ b/arch/arm64/include/asm/hugetlb.h
>>> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>>>  {
>>>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>>>  
>>> -	if (stride == PMD_SIZE)
>>> -		__flush_tlb_range(vma, start, end, stride, false, 2);
>>> -	else if (stride == PUD_SIZE)
>>> -		__flush_tlb_range(vma, start, end, stride, false, 1);
>>> -	else
>>> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
>>> +	switch (stride) {
>>> +	case PUD_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
>>> +		break;
>>
>> Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
>> checks also be added here for this PUD_SIZE case ?
> 
> Yeah I guess so. TBH, it's never been entirely clear to me what the benefit is?
> Is it just to remove (a tiny amount of) dead code when we know we don't support
> blocks at the level? Or is there something more fundamental going on that I've
> missed?

There is a generic fallback for PUD_SIZE in include/asm-generic/pgtable-nopud.h when
it is not defined on the arm64 platform, and pud_sect_supported() can also be optimized
away by the compiler:

static inline bool pud_sect_supported(void)
{
        return PAGE_SIZE == SZ_4K;
}

IIUC this just saves dead code from being compiled as you mentioned.

> 
> We seem to be quite inconsistent with the use of pud_sect_supported() in
> hugetlbpage.c.

The PUD_SIZE switch cases in hugetlb_mask_last_page() and arch_make_huge_pte()? Those
should be fixed.

> 
> Anyway, I'll add this in, I guess it's preferable to follow the established pattern.

Agreed.

> 
> Thanks,
> Ryan
> 
>>
>>> +	case CONT_PMD_SIZE:
>>> +	case PMD_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
>>> +		break;
>>> +	case CONT_PTE_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
>>> +		break;
>>> +	default:
>>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
>>> +	}
>>>  }
>>>  
>>>  #endif /* __ASM_HUGETLB_H */
>