damon_get_folio() speculatively calls folio_test_lru() before
folio_try_get(). The folio can get freed and reallocated to a tail
page. In the case, VM_BUG_ON_PGFLAGS() in const_folio_flags() can be
triggered. Remove the speculative call.
Also do the folio_test_lru() check right after folio_try_get() success,
since it is more likely than folio realloc race.
The race should be rare. Also the problem can happen only if the kernel
has enabled CONFIG_DEBUG_VM_PGFLAGS. No real world report of this issue
has been made so far. This fix is based on only theoretical analysis.
That said, a bug is a bug. A similar issue was also fixed via commit
3203b3ab0fcf ("mm/filemap: don't call folio_test_locked() without a
reference in next_uptodate_folio()"). I don't expect this change will
make a meaningful impact to DAMON performance in the real world, though
I will be happy to be corrected from the real world reports.
The issue was discovered [1] by Sashiko.
[1] https://lore.kernel.org/20260517234112.89245-1-sj@kernel.org
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/damon/ops-common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
index 3a0ddc3ac7196..d3404615f9b75 100644
--- a/mm/damon/ops-common.c
+++ b/mm/damon/ops-common.c
@@ -32,9 +32,9 @@ struct folio *damon_get_folio(unsigned long pfn)
return NULL;
folio = page_folio(page);
- if (!folio_test_lru(folio) || !folio_try_get(folio))
+ if (!folio_try_get(folio))
return NULL;
- if (unlikely(page_folio(page) != folio || !folio_test_lru(folio))) {
+ if (!folio_test_lru(folio) || unlikely(page_folio(page) != folio)) {
folio_put(folio);
folio = NULL;
}
base-commit: a94d68c2dfd523cebb2755787fb01c08eef70c43
--
2.47.3