pagemap_scan_thp_entry() splits a huge PMD when the PAGEMAP_SCAN ioctl
needs to write-protect only a portion of a THP. It then returns -ENOENT
so pagemap_scan_pmd_entry() falls through to PTE-level handling.

Check the split_huge_pmd() return value and propagate any error on
failure. Returning the error (e.g. -ENOMEM) instead of -ENOENT prevents
the fallthrough to PTE handling, and the error propagates up through
walk_page_range() to do_pagemap_scan(), where it becomes the ioctl
return value.

pagemap_scan_backout_range() already undoes the buffered output, and
walk_end is written back to userspace so the caller knows where the
scan stopped.

If the split fails, the PMD remains huge. An alternative to the
approach in this patch is to still return -ENOENT, causing the caller
to proceed to pte_offset_map_lock(). ___pte_offset_map() detects the
trans_huge PMD and returns NULL, which sets ACTION_AGAIN, restarting
the walker on the same PMD, by which time the system might have enough
memory for the split to succeed.
Signed-off-by: Usama Arif <usama.arif@linux.dev>

---
fs/proc/task_mmu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e091931d7ca19..f5f459140b5c0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2714,9 +2714,13 @@ static int pagemap_scan_thp_entry(pmd_t *pmd, unsigned long start,
* needs to be performed on a portion of the huge page.
*/
if (end != start + HPAGE_SIZE) {
+ int err;
+
spin_unlock(ptl);
- split_huge_pmd(vma, pmd, start);
+ err = split_huge_pmd(vma, pmd, start);
pagemap_scan_backout_range(p, start, end);
+ if (err)
+ return err;
/* Report as if there was no THP */
return -ENOENT;
}
--
2.52.0