The nr_hugepgs variable is used to save the original nr_hugepages at the
hugepage setup step at the beginning of the test run. After the userfaultfd
test, a cleanup is executed: both
/sys/kernel/mm/hugepages/hugepages-*/nr_hugepages and
/proc/sys/vm/nr_hugepages are reset to the 'original' value from before the
userfaultfd test started.
The issue is that the value used to restore /proc/sys/vm/nr_hugepages is
nr_hugepgs, which is the initial value from before run_vmtests.sh ran, not
the value from before the userfaultfd test started. The
va_high_addr_switch.sh test, which runs after that, may then see no
hugepages available and get EINVAL from mmap(MAP_HUGETLB), making the
result invalid.
In addition, before the pkey tests, nr_hugepgs is reused as a temporary
variable to save nr_hugepages and to restore it after the pkey tests
finish. The original nr_hugepages value is no longer tracked, so there is
no way to restore it after all tests finish.
Add a new variable, orig_nr_hugepgs, to save the original nr_hugepages and
restore it after all tests finish. Also change nr_hugepgs to hold the value
of /proc/sys/vm/nr_hugepages after hugepage setup; this is also the value
before the userfaultfd test starts, and therefore the correct value to
restore after userfaultfd finishes. This resolves the
va_high_addr_switch.sh breakage.
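
The two-variable scheme described above can be sketched as follows. This is
only an illustration, not the script itself: a scratch file stands in for
/proc/sys/vm/nr_hugepages so that it runs without root, and the page counts
(8, 120) are made-up values. Only the variable names orig_nr_hugepgs and
nr_hugepgs come from the patch.

```shell
#!/bin/bash
# Sketch only: "knob" is a hypothetical stand-in for
# /proc/sys/vm/nr_hugepages; the numbers are invented for illustration.
knob=$(mktemp)
echo 8 > "$knob"                           # value before run_vmtests.sh starts

orig_nr_hugepgs=$(cat "$knob")             # saved once, restored at the very end
echo $((orig_nr_hugepgs + 120)) > "$knob"  # hugepage setup raises the count
nr_hugepgs=$(cat "$knob")                  # value the userfaultfd cleanup restores

echo 0 > "$knob"                           # a test changes the count ...
echo "$nr_hugepgs" > "$knob"               # ... cleanup puts back the setup value
after_uffd=$(cat "$knob")

echo "$orig_nr_hugepgs" > "$knob"          # after all tests: original value again
final=$(cat "$knob")
rm -f "$knob"

echo "after userfaultfd cleanup: $after_uffd, at exit: $final"
```

With a single variable, the cleanup step would write the pre-setup value
back, which is what left later hugetlb tests without pages.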
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
---
Changes in v2
- rename nr_hugepgs_origin to orig_nr_hugepgs
---
tools/testing/selftests/mm/run_vmtests.sh | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 471e539d82b8..9866e4221bc2 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -172,13 +172,13 @@ fi
# set proper nr_hugepages
if [ -n "$freepgs" ] && [ -n "$hpgsize_KB" ]; then
- nr_hugepgs=$(cat /proc/sys/vm/nr_hugepages)
+ orig_nr_hugepgs=$(cat /proc/sys/vm/nr_hugepages)
needpgs=$((needmem_KB / hpgsize_KB))
tries=2
while [ "$tries" -gt 0 ] && [ "$freepgs" -lt "$needpgs" ]; do
lackpgs=$((needpgs - freepgs))
echo 3 > /proc/sys/vm/drop_caches
- if ! echo $((lackpgs + nr_hugepgs)) > /proc/sys/vm/nr_hugepages; then
+ if ! echo $((lackpgs + orig_nr_hugepgs)) > /proc/sys/vm/nr_hugepages; then
echo "Please run this test as root"
exit $ksft_skip
fi
@@ -189,6 +189,7 @@ if [ -n "$freepgs" ] && [ -n "$hpgsize_KB" ]; then
done < /proc/meminfo
tries=$((tries - 1))
done
+ nr_hugepgs=$(cat /proc/sys/vm/nr_hugepages)
if [ "$freepgs" -lt "$needpgs" ]; then
printf "Not enough huge pages available (%d < %d)\n" \
"$freepgs" "$needpgs"
@@ -532,6 +533,10 @@ CATEGORY="page_frag" run_test ./test_page_frag.sh aligned
CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned
+if [ "${HAVE_HUGEPAGES}" = 1 ]; then
+ echo "$orig_nr_hugepgs" > /proc/sys/vm/nr_hugepages
+fi
+
echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix
echo "1..${count_total}" | tap_output
--
2.49.0
Allocate hugepages inside the test itself so that it does not rely entirely
on run_vmtests.sh. If run_vmtests.sh has already set them up and there are
enough free hugepages to run the test, leave the setting as it is;
otherwise set up the hugepages in the test.
Save the original nr_hugepages value and restore it after the test
finishes, so that a stable test environment is left behind.
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
---
.../selftests/mm/va_high_addr_switch.sh | 37 +++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/tools/testing/selftests/mm/va_high_addr_switch.sh b/tools/testing/selftests/mm/va_high_addr_switch.sh
index 325de53966b6..a7d4b02b21dd 100755
--- a/tools/testing/selftests/mm/va_high_addr_switch.sh
+++ b/tools/testing/selftests/mm/va_high_addr_switch.sh
@@ -9,6 +9,7 @@
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
+orig_nr_hugepages=0
skip()
{
@@ -76,5 +77,41 @@ check_test_requirements()
esac
}
+save_nr_hugepages()
+{
+ orig_nr_hugepages=$(cat /proc/sys/vm/nr_hugepages)
+}
+
+restore_nr_hugepages()
+{
+ echo "$orig_nr_hugepages" > /proc/sys/vm/nr_hugepages
+}
+
+setup_nr_hugepages()
+{
+ local needpgs=$1
+ while read -r name size unit; do
+ if [ "$name" = "HugePages_Free:" ]; then
+ freepgs="$size"
+ break
+ fi
+ done < /proc/meminfo
+ if [ "$freepgs" -ge "$needpgs" ]; then
+ return
+ fi
+ local hpgs=$((orig_nr_hugepages + needpgs))
+ echo $hpgs > /proc/sys/vm/nr_hugepages
+
+ local nr_hugepgs=$(cat /proc/sys/vm/nr_hugepages)
+ if [ "$nr_hugepgs" != "$hpgs" ]; then
+ restore_nr_hugepages
+ skip "$0: no enough hugepages for testing"
+ fi
+}
+
check_test_requirements
+save_nr_hugepages
+# 4 keep_mapped pages, and one for tmp usage
+setup_nr_hugepages 5
./va_high_addr_switch --run-hugetlb
+restore_nr_hugepages
--
2.49.0
The test fails as below on x86_64 with CPU la57 support (it is skipped
without la57 support). Note, the test requires nr_hugepages to be set first.
# running bash ./va_high_addr_switch.sh
# -------------------------------------
# mmap(addr_switch_hint - pagesize, pagesize): 0x7f55b60fa000 - OK
# mmap(addr_switch_hint - pagesize, (2 * pagesize)): 0x7f55b60f9000 - OK
# mmap(addr_switch_hint, pagesize): 0x800000000000 - OK
# mmap(addr_switch_hint, 2 * pagesize, MAP_FIXED): 0x800000000000 - OK
# mmap(NULL): 0x7f55b60f9000 - OK
# mmap(low_addr): 0x40000000 - OK
# mmap(high_addr): 0x1000000000000 - OK
# mmap(high_addr) again: 0xffff55b6136000 - OK
# mmap(high_addr, MAP_FIXED): 0x1000000000000 - OK
# mmap(-1): 0xffff55b6134000 - OK
# mmap(-1) again: 0xffff55b6132000 - OK
# mmap(addr_switch_hint - pagesize, pagesize): 0x7f55b60fa000 - OK
# mmap(addr_switch_hint - pagesize, 2 * pagesize): 0x7f55b60f9000 - OK
# mmap(addr_switch_hint - pagesize/2 , 2 * pagesize): 0x7f55b60f7000 - OK
# mmap(addr_switch_hint, pagesize): 0x800000000000 - OK
# mmap(addr_switch_hint, 2 * pagesize, MAP_FIXED): 0x800000000000 - OK
# mmap(NULL, MAP_HUGETLB): 0x7f55b5c00000 - OK
# mmap(low_addr, MAP_HUGETLB): 0x40000000 - OK
# mmap(high_addr, MAP_HUGETLB): 0x1000000000000 - OK
# mmap(high_addr, MAP_HUGETLB) again: 0xffff55b5e00000 - OK
# mmap(high_addr, MAP_FIXED | MAP_HUGETLB): 0x1000000000000 - OK
# mmap(-1, MAP_HUGETLB): 0x7f55b5c00000 - OK
# mmap(-1, MAP_HUGETLB) again: 0x7f55b5a00000 - OK
# mmap(addr_switch_hint - pagesize, 2*hugepagesize, MAP_HUGETLB): 0x800000000000 - FAILED
# mmap(addr_switch_hint , 2*hugepagesize, MAP_FIXED | MAP_HUGETLB): 0x800000000000 - OK
# [FAIL]
addr_switch_hint is defined as DEFAULT_MAP_WINDOW in the failed test (for
x86_64, DEFAULT_MAP_WINDOW is defined as (1UL<<47) - pagesize in 64 bit).
Before commit cc92882ee218 ("mm: drop hugetlb_get_unmapped_area{_*}
functions"), hugetlb_get_unmapped_area() for x86_64 was handled in arch code
(arch/x86/mm/hugetlbpage.c), and addr was checked with
map_address_hint_valid() after being aligned with
'addr &= huge_page_mask(h)', which rounds down. The check failed because
addr was within DEFAULT_MAP_WINDOW but (addr + len) was above
DEFAULT_MAP_WINDOW. So it went through
hugetlb_get_unmapped_area_topdown() to find an area within
DEFAULT_MAP_WINDOW.
After commit cc92882ee218 ("mm: drop hugetlb_get_unmapped_area{_*}
functions"), the addr hint for hugetlb_get_unmapped_area() is rounded up
and aligned to the hugepage size with ALIGN() for all arches. After the
alignment, the addr is above DEFAULT_MAP_WINDOW, and the
map_address_hint_valid() check passes because both the aligned addr (addr0)
and (addr + len) are above DEFAULT_MAP_WINDOW. The aligned hint address
(0x800000000000) is returned because a suitable gap is found there in
arch_get_unmapped_area_topdown().
To still cover the case where addr is within DEFAULT_MAP_WINDOW and
addr + len is above DEFAULT_MAP_WINDOW, choose the last hugepage-aligned
address within DEFAULT_MAP_WINDOW as the hint addr, so that addr + len
(2 hugepages) ends one hugepage above DEFAULT_MAP_WINDOW. An aligned
address is not affected by the kernel rounding the hint up or down, so the
result is deterministic.
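
The alignment arithmetic above can be checked with a small shell sketch. It
assumes 4 KiB base pages, 2 MiB huge pages, and addr_switch_hint = 1 << 47
(the 0x800000000000 value printed in the test log); round_down and round_up
are illustrative helpers that mimic 'addr & huge_page_mask(h)' and ALIGN(),
not kernel code.

```shell
#!/bin/bash
# Assumed values for illustration: 4 KiB pages, 2 MiB huge pages,
# addr_switch_hint = 1 << 47 as seen in the test output.
pagesize=4096
hugepagesize=$((2 * 1024 * 1024))
addr_switch_hint=$((1 << 47))
mask=$((~(hugepagesize - 1)))

# Hypothetical helpers: round an address down or up to a hugepage boundary.
round_down() { echo $(( $1 & mask )); }
round_up()   { echo $(( ($1 + hugepagesize - 1) & mask )); }

old_hint=$((addr_switch_hint - pagesize))       # hint used before this patch
new_hint=$((addr_switch_hint - hugepagesize))   # hint used after this patch

old_down=$(printf '0x%x' "$(round_down "$old_hint")")
old_up=$(printf '0x%x' "$(round_up "$old_hint")")
new_down=$(printf '0x%x' "$(round_down "$new_hint")")
new_up=$(printf '0x%x' "$(round_up "$new_hint")")
new_end=$(printf '0x%x' $((new_hint + 2 * hugepagesize)))

echo "old hint: round down $old_down, round up $old_up"
echo "new hint: round down $new_down, round up $new_up, end $new_end"
```

Under these assumptions the old hint lands on different sides of the window
depending on the rounding direction, while the new hint is unchanged by
either rounding and the 2-hugepage range still crosses the boundary.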
Fixes: cc92882ee218 ("mm: drop hugetlb_get_unmapped_area{_*} functions")
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
---
tools/testing/selftests/mm/va_high_addr_switch.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/mm/va_high_addr_switch.c b/tools/testing/selftests/mm/va_high_addr_switch.c
index 896b3f73fc53..306eba825107 100644
--- a/tools/testing/selftests/mm/va_high_addr_switch.c
+++ b/tools/testing/selftests/mm/va_high_addr_switch.c
@@ -230,10 +230,10 @@ void testcases_init(void)
.msg = "mmap(-1, MAP_HUGETLB) again",
},
{
- .addr = (void *)(addr_switch_hint - pagesize),
+ .addr = (void *)(addr_switch_hint - hugepagesize),
.size = 2 * hugepagesize,
.flags = MAP_HUGETLB | MAP_PRIVATE | MAP_ANONYMOUS,
- .msg = "mmap(addr_switch_hint - pagesize, 2*hugepagesize, MAP_HUGETLB)",
+ .msg = "mmap(addr_switch_hint - hugepagesize, 2*hugepagesize, MAP_HUGETLB)",
.low_addr_required = 1,
.keep_mapped = 1,
},
--
2.49.0
On 12.09.25 03:37, Chunyu Hu wrote:
> The test will fail as below on x86_64 with cpu la57 support (will skip if
> no la57 support).
[...]
> Fixes: cc92882ee218 ("mm: drop hugetlb_get_unmapped_area{_*} functions")
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Chunyu Hu <chuhu@redhat.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

--
Cheers

David / dhildenb