[PATCH v1 2/7] mm: make zap_pte_range() handle full within-PMD range

Qi Zheng posted 7 patches 1 month, 1 week ago
There is a newer version of this series
[PATCH v1 2/7] mm: make zap_pte_range() handle full within-PMD range
Posted by Qi Zheng 1 month, 1 week ago
In preparation for reclaiming empty PTE pages, this commit first makes
zap_pte_range() to handle the full within-PMD range, so that we can more
easily detect and free PTE pages in this function in subsequent commits.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/memory.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index caa6ed0a7fe5b..fd57c0f49fce2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1602,6 +1602,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	swp_entry_t entry;
 	int nr;
 
+retry:
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	init_rss_vec(rss);
 	start_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
@@ -1706,6 +1707,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	if (force_flush)
 		tlb_flush_mmu(tlb);
 
+	if (addr != end) {
+		cond_resched();
+		goto retry;
+	}
+
 	return addr;
 }
 
@@ -1744,8 +1750,6 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
 			continue;
 		}
 		addr = zap_pte_range(tlb, vma, pmd, addr, next, details);
-		if (addr != next)
-			pmd--;
 	} while (pmd++, cond_resched(), addr != end);
 
 	return addr;
-- 
2.20.1
Re: [PATCH v1 2/7] mm: make zap_pte_range() handle full within-PMD range
Posted by Jann Horn 1 month, 1 week ago
On Thu, Oct 17, 2024 at 11:48 AM Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> In preparation for reclaiming empty PTE pages, this commit first makes
> zap_pte_range() to handle the full within-PMD range, so that we can more
> easily detect and free PTE pages in this function in subsequent commits.

I think your patch causes some unintended difference in behavior:

> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
>  mm/memory.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index caa6ed0a7fe5b..fd57c0f49fce2 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1602,6 +1602,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
>         swp_entry_t entry;
>         int nr;
>
> +retry:

This "retry" label is below the line "bool force_flush = false,
force_break = false;", so I think after force_break is set once and
you go through the retry path, every subsequent present PTE will again
bail out and retry. I think that doesn't lead to anything bad, but it
seems unintended.
Re: [PATCH v1 2/7] mm: make zap_pte_range() handle full within-PMD range
Posted by Qi Zheng 1 month, 1 week ago

On 2024/10/18 02:06, Jann Horn wrote:
> On Thu, Oct 17, 2024 at 11:48 AM Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>> In preparation for reclaiming empty PTE pages, this commit first makes
>> zap_pte_range() to handle the full within-PMD range, so that we can more
>> easily detect and free PTE pages in this function in subsequent commits.
> 
> I think your patch causes some unintended difference in behavior:
> 
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> ---
>>   mm/memory.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index caa6ed0a7fe5b..fd57c0f49fce2 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -1602,6 +1602,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
>>          swp_entry_t entry;
>>          int nr;
>>
>> +retry:
> 
> This "retry" label is below the line "bool force_flush = false,
> force_break = false;", so I think after force_break is set once and
> you go through the retry path, every subsequent present PTE will again
> bail out and retry. I think that doesn't lead to anything bad, but it
> seems unintended.

Right, thanks for catching this! Will set force_flush and force_break to
false under "retry" label in v2.

Thanks!