When non-leaf pmd accessed bits are available, MGLRU page table walks
can clear the non-leaf pmd accessed bit and ignore the accessed bit on
the pte if it's on a different node, skipping a generation update as
well. If another scan occurs on the same node as said skipped pte,
the non-leaf pmd accessed bit might remain cleared and the pte accessed
bits won't be checked. While this is sufficient for reclaim-driven
aging, where the goal is to select a reasonably cold page, the access
can be missed when aging proactively for workingset estimation of a
node/memcg.
In more detail, get_pfn_folio() returns NULL if the folio's nid != the
node under scanning, so the page table walk skips processing of said pte.
Now the pmd_young flag on this pmd is cleared, and if none of the ptes are
accessed before another scan occurs on the folio's node, the pmd_young
check fails and the pte accessed bit is skipped.
Since force_scan disables various other optimizations, we check
force_scan to ignore the non-leaf pmd accessed bit.
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
---
mm/vmscan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfa839284b92..4a112c2d1a64 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3476,7 +3476,7 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long addr, struct vm_area
 			goto next;
 		if (!pmd_trans_huge(pmd[i])) {
-			if (should_clear_pmd_young())
+			if (!walk->force_scan && should_clear_pmd_young())
 				pmdp_test_and_clear_young(vma, addr, pmd + i);
 			goto next;
 		}
@@ -3563,7 +3563,7 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end,
 		walk->mm_stats[MM_NONLEAF_TOTAL]++;
-		if (should_clear_pmd_young()) {
+		if (!walk->force_scan && should_clear_pmd_young()) {
 			if (!pmd_young(val))
 				continue;
--
2.46.0.76.ge559c4bf1a-goog
On Tue, Aug 13, 2024 at 10:38 AM Yuanchu Xie <yuanchu@google.com> wrote:
>
> When non-leaf pmd accessed bits are available, MGLRU page table walks
> can clear the non-leaf pmd accessed bit and ignore the accessed bit on
> the pte if it's on a different node, skipping a generation update as
> well. If another scan occurs on the same node as said skipped pte.
> the non-leaf pmd accessed bit might remain cleared and the pte accessed
> bits won't be checked. While this is sufficient for reclaim-driven
> aging, where the goal is to select a reasonably cold page, the access
> can be missed when aging proactively for workingset estimation of a
> node/memcg.
>
> In more detail, get_pfn_folio returns NULL if the folio's nid != node
> under scanning, so the page table walk skips processing of said pte. Now
> the pmd_young flag on this pmd is cleared, and if none of the pte's are
> accessed before another scan occurs on the folio's node, the pmd_young
> check fails and the pte accessed bit is skipped.
>
> Since force_scan disables various other optimizations, we check
> force_scan to ignore the non-leaf pmd accessed bit.
>
> Signed-off-by: Yuanchu Xie <yuanchu@google.com>

Acked-by: Yu Zhao <yuzhao@google.com>