[PATCH v2 2/2] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE

Posted by Kefeng Wang 2 years, 6 months ago
It is better to use the huge page size instead of PAGE_SIZE
for the stride when flushing hugepages, which reduces the
number of loop iterations in __flush_tlb_range().

Let's provide an arch-specific flush_hugetlb_tlb_range(), which is
used by hugetlb_unshare_all_pmds(), move_hugetlb_page_tables()
and hugetlb_change_protection() for now.

Note that hugepages based on the contiguous bit still have to be
invalidated individually, since the contiguous PTE bit is just a
hint and the hardware may or may not take it into account.
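
For illustration, a minimal sketch of why the stride matters; this is
not the actual arm64 implementation, and issue_tlbi() is a hypothetical
stand-in for the real TLBI sequence:

	/*
	 * __flush_tlb_range() steps through the range stride bytes at
	 * a time, so a PMD_SIZE stride issues one invalidation per 2MB
	 * hugepage (with a 4K granule) instead of 512 PAGE_SIZE ones.
	 */
	static void flush_range_sketch(unsigned long start,
				       unsigned long end,
				       unsigned long stride)
	{
		unsigned long addr;

		for (addr = start; addr < end; addr += stride)
			issue_tlbi(addr);	/* one TLBI per stride step */
	}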

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/include/asm/hugetlb.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 6a4a1ab8eb23..e5c2e3dd9cf0 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -60,4 +60,16 @@ extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 
 #include <asm-generic/hugetlb.h>
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	unsigned long stride = huge_page_size(hstate_vma(vma));
+
+	if (stride != PMD_SIZE && stride != PUD_SIZE)
+		stride = PAGE_SIZE;
+	__flush_tlb_range(vma, start, end, stride, false, 0);
+}
+
 #endif /* __ASM_HUGETLB_H */
-- 
2.41.0
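
For context, without an arch override the hugetlb code falls back to
flush_tlb_range(), which on arm64 uses a PAGE_SIZE stride. A sketch of
the generic fallback, roughly as it appears in mm/hugetlb.c (worth
double-checking the exact form there):

	/*
	 * Unless the arch defines __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
	 * and supplies flush_hugetlb_tlb_range(), callers get the
	 * plain range flush and loop once per base page.
	 */
	#ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
	#define flush_hugetlb_tlb_range(vma, addr, end)	\
		flush_tlb_range(vma, addr, end)
	#endif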
Re: [PATCH v2 2/2] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Catalin Marinas 2 years, 6 months ago
On Tue, Aug 01, 2023 at 10:31:45AM +0800, Kefeng Wang wrote:
> +#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
> +static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
> +					   unsigned long start,
> +					   unsigned long end)
> +{
> +	unsigned long stride = huge_page_size(hstate_vma(vma));
> +
> +	if (stride != PMD_SIZE && stride != PUD_SIZE)
> +		stride = PAGE_SIZE;
> +	__flush_tlb_range(vma, start, end, stride, false, 0);

We could use some hints here for the tlb_level (2 for pmd, 1 for pud).
Regarding the last_level argument to __flush_tlb_range(), I think it
needs to stay false since this function is also called on the
hugetlb_unshare_pmds() path where the pud is cleared and needs
invalidating.
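
For reference, the prototype under discussion, matching the calls in
the patch; the level meanings beyond the "2 for pmd, 1 for pud" above
are inferred, so double-check them against tlbflush.h:

	/*
	 * stride: step between invalidations; last_level: set when
	 * only leaf (last-level) entries need invalidating;
	 * tlb_level: page-table level of the entries being unmapped
	 * (0 = unknown, 1 = pud, 2 = pmd, 3 = pte).
	 */
	void __flush_tlb_range(struct vm_area_struct *vma,
			       unsigned long start, unsigned long end,
			       unsigned long stride, bool last_level,
			       int tlb_level);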

That said, maybe you can rewrite it as a switch statement and call
flush_pmd_tlb_range() or flush_pud_tlb_range() (just make sure these are
defined when CONFIG_HUGETLBFS is enabled).

-- 
Catalin
Re: [PATCH v2 2/2] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Kefeng Wang 2 years, 6 months ago

On 2023/8/1 19:09, Catalin Marinas wrote:
> On Tue, Aug 01, 2023 at 10:31:45AM +0800, Kefeng Wang wrote:
>> +#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
>> +static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>> +					   unsigned long start,
>> +					   unsigned long end)
>> +{
>> +	unsigned long stride = huge_page_size(hstate_vma(vma));
>> +
>> +	if (stride != PMD_SIZE && stride != PUD_SIZE)
>> +		stride = PAGE_SIZE;
>> +	__flush_tlb_range(vma, start, end, stride, false, 0);
> 
> We could use some hints here for the tlb_level (2 for pmd, 1 for pud).
> Regarding the last_level argument to __flush_tlb_range(), I think it
> needs to stay false since this function is also called on the
> hugetlb_unshare_pmds() path where the pud is cleared and needs
> invalidating.
> 
> That said, maybe you can rewrite it as a switch statement and call
> flush_pmd_tlb_range() or flush_pud_tlb_range() (just make sure these are
> defined when CONFIG_HUGETLBFS is enabled).
> 

How about the following, which doesn't involve the THP code?

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index e5c2e3dd9cf0..a7ce59d3388e 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -66,10 +66,22 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
 					   unsigned long end)
 {
 	unsigned long stride = huge_page_size(hstate_vma(vma));
+	int tlb_level = 0;
 
-	if (stride != PMD_SIZE && stride != PUD_SIZE)
+	switch (stride) {
+#ifndef __PAGETABLE_PMD_FOLDED
+	case PUD_SIZE:
+		tlb_level = 1;
+		break;
+#endif
+	case PMD_SIZE:
+		tlb_level = 2;
+		break;
+	default:
 		stride = PAGE_SIZE;
-	__flush_tlb_range(vma, start, end, stride, false, 0);
+	}
+
+	__flush_tlb_range(vma, start, end, stride, false, tlb_level);
 }

Re: [PATCH v2 2/2] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Kefeng Wang 2 years, 6 months ago

On 2023/8/1 19:22, Kefeng Wang wrote:
> 
> 
> On 2023/8/1 19:09, Catalin Marinas wrote:
>> On Tue, Aug 01, 2023 at 10:31:45AM +0800, Kefeng Wang wrote:
>>> +#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
>>> +static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>>> +                       unsigned long start,
>>> +                       unsigned long end)
>>> +{
>>> +    unsigned long stride = huge_page_size(hstate_vma(vma));
>>> +
>>> +    if (stride != PMD_SIZE && stride != PUD_SIZE)
>>> +        stride = PAGE_SIZE;
>>> +    __flush_tlb_range(vma, start, end, stride, false, 0);
>>
>> We could use some hints here for the tlb_level (2 for pmd, 1 for pud).
>> Regarding the last_level argument to __flush_tlb_range(), I think it
>> needs to stay false since this function is also called on the
>> hugetlb_unshare_pmds() path where the pud is cleared and needs
>> invalidating.
>> 
>> That said, maybe you can rewrite it as a switch statement and call
>> flush_pmd_tlb_range() or flush_pud_tlb_range() (just make sure these are
>> defined when CONFIG_HUGETLBFS is enabled).

I tried the way you mentioned; it doesn't change much. Will send v3, thanks.

Re: [PATCH v2 2/2] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Muchun Song 2 years, 6 months ago

> On Aug 1, 2023, at 10:31, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> It is better to use the huge page size instead of PAGE_SIZE
> for the stride when flushing hugepages, which reduces the
> number of loop iterations in __flush_tlb_range().
> 
> Let's provide an arch-specific flush_hugetlb_tlb_range(), which is
> used by hugetlb_unshare_all_pmds(), move_hugetlb_page_tables()
> and hugetlb_change_protection() for now.
> 
> Note that hugepages based on the contiguous bit still have to be
> invalidated individually, since the contiguous PTE bit is just a
> hint and the hardware may or may not take it into account.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>
[PATCH v3] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Kefeng Wang 2 years, 6 months ago
It is better to use the huge page size instead of PAGE_SIZE
for the stride when flushing hugepages, which reduces the
number of loop iterations in __flush_tlb_range().

Let's provide an arch-specific flush_hugetlb_tlb_range(), which is
used by hugetlb_unshare_all_pmds(), move_hugetlb_page_tables()
and hugetlb_change_protection() for now.

Note that hugepages based on the contiguous bit still have to be
invalidated individually, since the contiguous PTE bit is just a
hint and the hardware may or may not take it into account.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
v3: add tlb_level hint by using flush_pud/pmd_tlb_range,
    suggested by Catalin Marinas

 arch/arm64/include/asm/hugetlb.h | 21 +++++++++++++++++++++
 arch/arm64/include/asm/pgtable.h |  4 ++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 6a4a1ab8eb23..0acb1e8b41e9 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -60,4 +60,25 @@ extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 
 #include <asm-generic/hugetlb.h>
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	unsigned long stride = huge_page_size(hstate_vma(vma));
+
+	switch (stride) {
+#ifndef __PAGETABLE_PMD_FOLDED
+	case PUD_SIZE:
+		flush_pud_tlb_range(vma, start, end);
+		break;
+#endif
+	case PMD_SIZE:
+		flush_pmd_tlb_range(vma, start, end);
+		break;
+	default:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+	}
+}
+
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0bd18de9fd97..def402afcbe9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -35,7 +35,7 @@
 #include <linux/sched.h>
 #include <linux/page_table_check.h>
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLB_PAGE)
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
 
 /* Set stride and tlb_level in flush_*_tlb_range */
@@ -43,7 +43,7 @@
 	__flush_tlb_range(vma, addr, end, PMD_SIZE, false, 2)
 #define flush_pud_tlb_range(vma, addr, end)	\
 	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLB_PAGE */
 
 static inline bool arch_thp_swp_supported(void)
 {
-- 
2.41.0
Re: [PATCH v3] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Catalin Marinas 2 years, 6 months ago
On Tue, Aug 01, 2023 at 09:56:16PM +0800, Kefeng Wang wrote:
> +#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
> +static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
> +					   unsigned long start,
> +					   unsigned long end)
> +{
> +	unsigned long stride = huge_page_size(hstate_vma(vma));
> +
> +	switch (stride) {
> +#ifndef __PAGETABLE_PMD_FOLDED
> +	case PUD_SIZE:
> +		flush_pud_tlb_range(vma, start, end);
> +		break;
> +#endif
> +	case PMD_SIZE:
> +		flush_pmd_tlb_range(vma, start, end);
> +		break;
> +	default:
> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> +	}
> +}

I think we should be consistent and either use __flush_tlb_range()
everywhere or flush_p*d_tlb_range() together with flush_tlb_range().
Maybe using __flush_tlb_range() for the pmd/pud is not too bad, and it
makes for a smaller patch.

That said, I'd avoid the #ifndef and just go for an if/else statement:

	if (stride == PMD_SIZE)
		__flush_tlb_range(vma, start, end, stride, false, 2);
	else if (stride == PUD_SIZE)
		__flush_tlb_range(vma, start, end, stride, false, 1);
	else
		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);

With the PMD folded, PMD_SIZE and PUD_SIZE are the same and the
compiler should eliminate the second branch.

-- 
Catalin
[PATCH v4] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Kefeng Wang 2 years, 6 months ago
It is better to use the huge page size instead of PAGE_SIZE
for the stride when flushing hugepages, which reduces the
number of loop iterations in __flush_tlb_range().

Let's provide an arch-specific flush_hugetlb_tlb_range(), which is
used by hugetlb_unshare_all_pmds(), move_hugetlb_page_tables()
and hugetlb_change_protection() for now.

Note that hugepages based on the contiguous bit still have to be
invalidated individually, since the contiguous PTE bit is just a
hint and the hardware may or may not take it into account.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
v4: directly pass tlb_level to __flush_tlb_range() with PMD/PUD size,
    suggested by Catalin
v3: add tlb_level hint by using flush_pud/pmd_tlb_range, 
    suggested by Catalin

 arch/arm64/include/asm/hugetlb.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 6a4a1ab8eb23..a91d6219aa78 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -60,4 +60,19 @@ extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 
 #include <asm-generic/hugetlb.h>
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	unsigned long stride = huge_page_size(hstate_vma(vma));
+
+	if (stride == PMD_SIZE)
+		__flush_tlb_range(vma, start, end, stride, false, 2);
+	else if (stride == PUD_SIZE)
+		__flush_tlb_range(vma, start, end, stride, false, 1);
+	else
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+}
+
 #endif /* __ASM_HUGETLB_H */
-- 
2.41.0
Re: [PATCH v4] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Catalin Marinas 2 years, 6 months ago
On Wed, Aug 02, 2023 at 09:27:31AM +0800, Kefeng Wang wrote:
> It is better to use the huge page size instead of PAGE_SIZE
> for the stride when flushing hugepages, which reduces the
> number of loop iterations in __flush_tlb_range().
> 
> Let's provide an arch-specific flush_hugetlb_tlb_range(), which is
> used by hugetlb_unshare_all_pmds(), move_hugetlb_page_tables()
> and hugetlb_change_protection() for now.
> 
> Note that hugepages based on the contiguous bit still have to be
> invalidated individually, since the contiguous PTE bit is just a
> hint and the hardware may or may not take it into account.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Re: [PATCH v4] arm64: hugetlb: enable __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
Posted by Muchun Song 2 years, 6 months ago

> On Aug 2, 2023, at 09:27, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> It is better to use the huge page size instead of PAGE_SIZE
> for the stride when flushing hugepages, which reduces the
> number of loop iterations in __flush_tlb_range().
> 
> Let's provide an arch-specific flush_hugetlb_tlb_range(), which is
> used by hugetlb_unshare_all_pmds(), move_hugetlb_page_tables()
> and hugetlb_change_protection() for now.
> 
> Note that hugepages based on the contiguous bit still have to be
> invalidated individually, since the contiguous PTE bit is just a
> hint and the hardware may or may not take it into account.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>