madvise_collapse() computes a THP-aligned window from the caller's range:

    hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK  /* round up */
    hend   = end & HPAGE_PMD_MASK                        /* round down */

When the caller's range is smaller than one PMD (2 MiB) and/or not
PMD-aligned, hstart can end up greater than hend. In that case the
collapsing loop is correctly skipped, but the return value was computed
as ((hend - hstart) >> HPAGE_PMD_SHIFT): with hstart > hend the
subtraction wraps unsigned, producing a huge value, the comparison
"thps != 0" fires, and -EINVAL is returned instead of 0.

A concrete example:

    /* both cover less than one THP; both should return 0 */
    madvise(aligned, PAGE_SIZE, MADV_COLLAPSE);              /* OK, returns 0 */
    madvise(aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);  /* returns -EINVAL */

The fix moves the hstart/hend calculation before kmalloc_obj() and
returns 0 early when hstart >= hend. This also avoids the kmalloc,
mmgrab(), and lru_add_drain_all() calls for ranges that trivially
contain no PMD window. The same effect could be achieved by only
guarding the final return expression, but early-return keeps the
no-op path free of the allocator and drain overhead.

Patch 1 fixes the kernel bug.
Patch 2 adds a selftest with two cases covering the hstart == hend
(aligned, was already correct) and hstart > hend (unaligned, was
broken) scenarios.

Chen Wandun (2):
  mm/khugepaged: fix spurious -EINVAL from sub-PMD MADV_COLLAPSE range
  selftests/mm: add MADV_COLLAPSE sub-PMD range tests

 mm/khugepaged.c                               |   9 +-
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   2 +
 .../selftests/mm/ksft_madv_collapse.sh        |   4 +
 .../selftests/mm/madv_collapse_range.c        | 141 ++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh     |   5 +
 6 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100755 tools/testing/selftests/mm/ksft_madv_collapse.sh
 create mode 100644 tools/testing/selftests/mm/madv_collapse_range.c

--
2.43.0
Hi,

scripts/get_maintainer.pl is your friend :)
Please use it to Cc the relevant maintainers and reviewers next time.

Cheers, Lance

On Thu, May 07, 2026 at 03:05:56PM +0800, Chen Wandun wrote:
>madvise_collapse() computes a THP-aligned window from the caller's range:
>
>    hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK  /* round up */
>    hend   = end & HPAGE_PMD_MASK                        /* round down */
>
>When the caller's range is smaller than one PMD (2 MiB) and/or not
>PMD-aligned, hstart can end up greater than hend. In that case the
>collapsing loop is correctly skipped, but the return value was computed
>as ((hend - hstart) >> HPAGE_PMD_SHIFT): with hstart > hend the
>subtraction wraps unsigned, producing a huge value, the comparison
>"thps != 0" fires, and -EINVAL is returned instead of 0.
>
>A concrete example:
>
>    /* both cover less than one THP; both should return 0 */
>    madvise(aligned, PAGE_SIZE, MADV_COLLAPSE);              /* OK, returns 0 */
>    madvise(aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);  /* returns -EINVAL */
>
>The fix moves the hstart/hend calculation before kmalloc_obj() and
>returns 0 early when hstart >= hend. This also avoids the kmalloc,
>mmgrab(), and lru_add_drain_all() calls for ranges that trivially
>contain no PMD window. The same effect could be achieved by only
>guarding the final return expression, but early-return keeps the
>no-op path free of the allocator and drain overhead.
>
>Patch 1 fixes the kernel bug.
>Patch 2 adds a selftest with two cases covering the hstart == hend
>(aligned, was already correct) and hstart > hend (unaligned, was
>broken) scenarios.
On 5/9/26 17:47, Lance Yang wrote:
> Hi,
>
> scripts/get_maintainer.pl is your friend :)
> Please use it to Cc the relevant maintainers and reviewers next time.

Many thanks for your kind reminder :)
I will do it next time.

Best regards,
Wandun

>
> Cheers, Lance
>
> On Thu, May 07, 2026 at 03:05:56PM +0800, Chen Wandun wrote:
>> madvise_collapse() computes a THP-aligned window from the caller's range:
>>
>>     hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK  /* round up */
>>     hend   = end & HPAGE_PMD_MASK                        /* round down */
>>
>> When the caller's range is smaller than one PMD (2 MiB) and/or not
>> PMD-aligned, hstart can end up greater than hend. In that case the
>> collapsing loop is correctly skipped, but the return value was computed
>> as ((hend - hstart) >> HPAGE_PMD_SHIFT): with hstart > hend the
>> subtraction wraps unsigned, producing a huge value, the comparison
>> "thps != 0" fires, and -EINVAL is returned instead of 0.
>>
>> A concrete example:
>>
>>     /* both cover less than one THP; both should return 0 */
>>     madvise(aligned, PAGE_SIZE, MADV_COLLAPSE);              /* OK, returns 0 */
>>     madvise(aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);  /* returns -EINVAL */
>>
>> The fix moves the hstart/hend calculation before kmalloc_obj() and
>> returns 0 early when hstart >= hend. This also avoids the kmalloc,
>> mmgrab(), and lru_add_drain_all() calls for ranges that trivially
>> contain no PMD window. The same effect could be achieved by only
>> guarding the final return expression, but early-return keeps the
>> no-op path free of the allocator and drain overhead.
>>
>> Patch 1 fixes the kernel bug.
>> Patch 2 adds a selftest with two cases covering the hstart == hend
>> (aligned, was already correct) and hstart > hend (unaligned, was
>> broken) scenarios.