From: Donet Tom <donettom@linux.ibm.com>
This patch fixes two issues.

1) After fork() in test_prctl_fork, the child process reads ksm_stat and
ksm_merging_pages through the file descriptors it inherited from the
parent. As a result, the child reads the parent's ksm_stat and
ksm_merging_pages values instead of its own, causing the test to fail.

Call init_global_file_handles() in the child process so that ksm_stat
and ksm_merging_pages are read through descriptors opened by the child
itself.

2) All tests currently call ksm_merge to trigger page merging.
To keep the system in a consistent state for subsequent tests, it is
better to call ksm_unmerge during the test cleanup phase.

In test_prctl_fork, reading ksm_merging_pages in the child after fork()
returns a non-zero value because a previous test performed a merge and
the child's memory state is inherited from the parent.

When the child then calls ksm_unmerge, the ksm_merging_pages counter in
the parent is reset to zero, while the child's counter remains
unchanged. This discrepancy causes the test to fail.

To avoid this, each test now calls ksm_unmerge during cleanup so that
the counter is reset and the system is in a clean state for subsequent
tests.

The argument to execv() must be an array of pointers to null-terminated
strings, and the array itself must end with a NULL pointer. This patch
also adds the missing NULL terminator to the execv() argument array.

Fixes: 6c47de3be3a0 ("selftest/mm: ksm_functional_tests: extend test case for ksm fork/exec")
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
 tools/testing/selftests/mm/ksm_functional_tests.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
index d8bd1911dfc0..996dc6645570 100644
--- a/tools/testing/selftests/mm/ksm_functional_tests.c
+++ b/tools/testing/selftests/mm/ksm_functional_tests.c
@@ -46,6 +46,8 @@ static int ksm_use_zero_pages_fd;
 static int pagemap_fd;
 static size_t pagesize;

+static void init_global_file_handles(void);
+
 static bool range_maps_duplicates(char *addr, unsigned long size)
 {
 	unsigned long offs_a, offs_b, pfn_a, pfn_b;
@@ -274,6 +276,7 @@ static void test_unmerge(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }

@@ -338,6 +341,7 @@ static void test_unmerge_zero_pages(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "KSM zero pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }

@@ -366,6 +370,7 @@ static void test_unmerge_discarded(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }

@@ -452,6 +457,7 @@ static void test_unmerge_uffd_wp(void)
 close_uffd:
 	close(uffd);
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 #endif
@@ -515,6 +521,7 @@ static int test_child_ksm(void)
 	else if (map == MAP_MERGE_SKIP)
 		return -3;

+	ksm_unmerge();
 	munmap(map, size);
 	return 0;
 }
@@ -548,6 +555,7 @@ static void test_prctl_fork(void)

 	child_pid = fork();
 	if (!child_pid) {
+		init_global_file_handles();
 		exit(test_child_ksm());
 	} else if (child_pid < 0) {
 		ksft_test_result_fail("fork() failed\n");
@@ -595,7 +603,7 @@ static void test_prctl_fork_exec(void)
 		return;
 	} else if (child_pid == 0) {
 		char *prg_name = "./ksm_functional_tests";
-		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME };
+		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME, NULL };

 		execv(prg_name, argv_for_program);
 		return;
@@ -644,6 +652,7 @@ static void test_prctl_unmerge(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }

@@ -677,6 +686,7 @@ static void test_prot_none(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }

--
2.47.1
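
For reference, the execv() part of the fix can be illustrated with a
minimal standalone sketch. This is not selftest code: the helper name
spawn_and_get_status() and the use of /bin/sh as the target program are
assumptions made only for illustration. It shows an argument array that
ends with a NULL pointer, which execv() relies on to find the end of the
array.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of ksm_functional_tests.c): run
 * "sh -c 'exit 7'" in a child and return the child's exit status,
 * or -1 on error.
 */
static int spawn_and_get_status(void)
{
	char *prog = "/bin/sh";	/* assumed to exist on the test system */
	/*
	 * execv() walks this array until it finds a NULL pointer, so the
	 * terminator is mandatory. Without it, execv() reads past the end
	 * of the array.
	 */
	char *argv_for_program[] = { "sh", "-c", "exit 7", NULL };
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execv(prog, argv_for_program);
		_exit(127);	/* only reached if execv() failed */
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would simply check the returned status against the value the
child was asked to exit with.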
On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>From: Donet Tom <donettom@linux.ibm.com>
>
>This patch fixed 2 issues.
>
[...]
>@@ -274,6 +276,7 @@ static void test_unmerge(void)
> 	ksft_test_result(!range_maps_duplicates(map, size),
> 			 "Pages were unmerged\n");
> unmap:
>+	ksm_unmerge();

In __mmap_and_merge_range(), we already call ksm_unmerge(). Why does that
one not help?

I am not very familiar with KSM. Would you mind explaining in more detail
how this fixes the failure you see?

[...]
>@@ -548,6 +555,7 @@ static void test_prctl_fork(void)
>
> 	child_pid = fork();
> 	if (!child_pid) {
>+		init_global_file_handles();

Would this leave the fds inherited from the parent orphaned?

> 		exit(test_child_ksm());
> 	} else if (child_pid < 0) {
> 		ksft_test_result_fail("fork() failed\n");
[...]

-- 
Wei Yang
Help you, Help me
[...]
>> @@ -548,6 +555,7 @@ static void test_prctl_fork(void)
>>
>>  	child_pid = fork();
>>  	if (!child_pid) {
>> +		init_global_file_handles();
>
> Would this leave fd in parent as orphan?

Probably yes, but only until the child quits, so likely we don't care.

-- 
Cheers,

David / dhildenb
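
To illustrate issue 1) from the patch description, here is a minimal
standalone sketch of why a descriptor opened on a /proc/self/ file before
fork() keeps reporting the parent. It is not selftest code: it reads
/proc/self/stat rather than the KSM files so that it works on any Linux
system with /proc mounted, and the helper names pid_seen_through() and
check_inherited_proc_fd() are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the first field (the PID) of a /proc/<pid>/stat file read via fd. */
static long pid_seen_through(int fd)
{
	char buf[64];
	ssize_t n = pread(fd, buf, sizeof(buf) - 1, 0);

	if (n <= 0)
		return -1;
	buf[n] = '\0';
	return strtol(buf, NULL, 10);
}

/*
 * Returns 0 if, in the child, the inherited descriptor still reports the
 * parent while a freshly opened one reports a different process (the
 * child). Returns non-zero otherwise.
 */
static int check_inherited_proc_fd(void)
{
	int fd = open("/proc/self/stat", O_RDONLY);
	long parent_pid;
	int status;
	pid_t pid;

	if (fd < 0)
		return -1;
	parent_pid = pid_seen_through(fd);
	if (parent_pid < 0) {
		close(fd);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(fd);
		return -1;
	}
	if (pid == 0) {
		/* Inherited fd: "/proc/self" was resolved in the parent. */
		long inherited = pid_seen_through(fd);
		/* Re-open: "/proc/self" is resolved again, now in the child. */
		int own_fd = open("/proc/self/stat", O_RDONLY);
		long own = own_fd < 0 ? -1 : pid_seen_through(own_fd);

		_exit((inherited == parent_pid && own > 0 &&
		       own != parent_pid) ? 0 : 1);
	}
	close(fd);
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Re-opening the files in the child, as init_global_file_handles() does in
the patch, corresponds to the second open() in this sketch.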
On 8/4/25 2:41 PM, Wei Yang wrote:
> On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>> From: Donet Tom <donettom@linux.ibm.com>
>>
>> This patch fixed 2 issues.
>>
[...]
>> @@ -274,6 +276,7 @@ static void test_unmerge(void)
>>   	ksft_test_result(!range_maps_duplicates(map, size),
>>   			 "Pages were unmerged\n");
>>   unmap:
>> +	ksm_unmerge();
> In __mmap_and_merge_range(), we call ksm_unmerge(). Why this one not help?
>
> Not very familiar with ksm stuff. Would you mind giving more on how this fix
> the failure you see?

The issue I was facing here was test_prctl_fork was failing.

# [RUN] test_prctl_fork
# Still pages merged
#

This issue occurred because the previous test performed a merge, causing
the value of /proc/self/ksm_merging_pages to reflect the number of
deduplicated pages. After that, a fork() was called. Post-fork, the child
process inherited the parent's ksm_merging_pages value.

Then, the child process invoked __mmap_and_merge_range(), which resulted
in unmerging the pages and resetting the value. However, since the parent
process had performed the merge, its ksm_merging_pages value also got
reset to 0. Meanwhile, the child process had not performed any merge
itself, so the inherited value remained unchanged. That’s why
get_my_merging_page() in the child was returning a non-zero value.

Initially, I fixed the issue by calling ksm_unmerge() before the fork(),
and that resolved the problem. Later, I decided it would be cleaner to
move the ksm_unmerge() call to the test cleanup phase.

[...]
>> @@ -548,6 +555,7 @@ static void test_prctl_fork(void)
>>
>>   	child_pid = fork();
>>   	if (!child_pid) {
>> +		init_global_file_handles();
> Would this leave fd in parent as orphan?
>
[...]
>> --
>> 2.47.1
>>
On Tue, Aug 05, 2025 at 11:39:15AM +0530, Donet Tom wrote:
>
>On 8/4/25 2:41 PM, Wei Yang wrote:
>> On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>> > From: Donet Tom <donettom@linux.ibm.com>
>> >
[...]
>> > @@ -274,6 +276,7 @@ static void test_unmerge(void)
>> >   	ksft_test_result(!range_maps_duplicates(map, size),
>> >   			 "Pages were unmerged\n");
>> >   unmap:
>> > +	ksm_unmerge();
>> In __mmap_and_merge_range(), we call ksm_unmerge(). Why this one not help?
>>
>> Not very familiar with ksm stuff. Would you mind giving more on how this fix
>> the failure you see?
>
>
>The issue I was facing here was test_prctl_fork was failing.
>
># [RUN] test_prctl_fork
># Still pages merged
>#
>
>This issue occurred because the previous test performed a merge, causing
>the value of /proc/self/ksm_merging_pages to reflect the number of
>deduplicated pages. After that, a fork() was called. Post-fork, the child
>process
>inherited the parent's ksm_merging_pages value.
>

Yes, this one is fixed by calling init_global_file_handles() in the child.

>Then, the child process invoked __mmap_and_merge_range(), which resulted
>in unmerging the pages and resetting the value. However, since the parent
>process
>had performed the merge, its ksm_merging_pages value also got reset to 0.
>Meanwhile, the child process had not performed any merge itself, so the
>inherited

I assume the behavior described here is what you see after the change to
call init_global_file_handles() in the child.

That the child process inherits ksm_merging_pages from the parent seems
reasonable to me. But I am confused about why ksm_unmerge() would reset
ksm_merging_pages only for the parent and leave ksm_merging_pages in the
child process unchanged.

ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system-wide
sysfs interface. I would expect it to apply to both parent and child.

>value remained unchanged. That’s why get_my_merging_page() in the child was
>returning a non-zero value.
>

I guess you mean that get_my_merging_page() in __mmap_and_merge_range()
returns a non-zero value. But there is a ksm_unmerge() before it. Why can't
that ksm_unmerge() reset the value, while a ksm_unmerge() in the parent can?

>Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>that
>resolved the problem. Later, I decided it would be cleaner to move the
>ksm_unmerge() call to the test cleanup phase.
>

Also, all the tests before test_prctl_fork(), except test_prctl(), call

  ksft_test_result(!range_maps_duplicates());

If the previous tests succeed, it means there are no duplicate pages. This
means ksm_merging_pages should be 0 before test_prctl_fork() if the other
tests pass, and the child process would inherit a ksm_merging_pages of 0.
(A quick test confirms it.)

So which part of the story did I miss?

-- 
Wei Yang
Help you, Help me
On 8/5/25 10:33 PM, Wei Yang wrote:
> On Tue, Aug 05, 2025 at 11:39:15AM +0530, Donet Tom wrote:
>> On 8/4/25 2:41 PM, Wei Yang wrote:
>>> On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>>>> From: Donet Tom <donettom@linux.ibm.com>
>>>>
[...]
>> This issue occurred because the previous test performed a merge, causing
>> the value of /proc/self/ksm_merging_pages to reflect the number of
>> deduplicated pages. After that, a fork() was called. Post-fork, the child
>> process
>> inherited the parent's ksm_merging_pages value.
>>
> Yes, this one is fixed by calling init_global_file_handles() in child.
>
>> Then, the child process invoked __mmap_and_merge_range(), which resulted
>> in unmerging the pages and resetting the value. However, since the parent
>> process
>> had performed the merge, its ksm_merging_pages value also got reset to 0.
>> Meanwhile, the child process had not performed any merge itself, so the
>> inherited
> I assume the behavior described here is after the change to call
> init_global_file_handles() in child.

Yes

>
> Child process inherit the ksm_merging_pages from parent, which is reasonable
> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
> for parent and leave ksm_merging_pages in child process unchanged.
>
> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
> interface. I expect it applies to both parent and child.

I am not very familiar with the KSM code, but from what I understand:

The ksm_merging_pages counter is maintained per mm_struct. When
we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
counters are updated for all mm_structs present in the ksm_mm_slot list.

A mm_struct gets added to this list when MADV_MERGEABLE is called.
In the case of the child process, since MADV_MERGEABLE has not been
invoked yet, its mm_struct is not part of the list. As a result,
its ksm_merging_pages counter is not reset.

>> value remained unchanged. That’s why get_my_merging_page() in the child was
>> returning a non-zero value.
>>
> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
> couldn't reset the value, but a ksm_unmerge() in parent could.
>
>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>> that
>> resolved the problem. Later, I decided it would be cleaner to move the
>> ksm_unmerge() call to the test cleanup phase.
>>
> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>
>    ksft_test_result(!range_maps_duplicates());
>
> If the previous tests succeed, it means there is no duplicate pages. This
> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
> proves it.)

If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
which internally calls break_ksm() in the kernel. This function replaces the
KSM page with an exclusive anonymous page. However, the
ksm_merging_pages counters are not updated at this point.

The function range_maps_duplicates(map, size) checks whether the pages
have been unmerged. Since break_ksm() does perform the unmerge, this
function returns false, and the test passes.

The ksm_merging_pages update happens later via the ksm_scan_thread().
That’s why we observe that ksm_merging_pages values are not reset
immediately after the test finishes.

If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
the ksm_merging_pages values are reset after the sleep.

Once the test completes successfully, we can call ksm_unmerge(), which
will immediately reset the ksm_merging_pages value. This way, in the fork
test, the child process will also see the correct value.

>
> So which part of the story I missed?
>

So, during the cleanup phase after a successful test, we can call
ksm_unmerge() to reset the counter. Do you see any issue with
this approach?

>
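
The two operations discussed above, reading the per-process counter and
asking KSM to unmerge everything, can be sketched roughly as follows. This
is a simplified standalone illustration, not the selftest's code: the
helper names parse_counter(), read_my_merging_pages() and unmerge_all()
are hypothetical, and the files are opened on every call rather than kept
open globally. It assumes a kernel that exposes
/proc/self/ksm_merging_pages, and writing to /sys/kernel/mm/ksm/run
normally needs root.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Parse a decimal counter as exported by procfs, e.g. "42\n". */
static long parse_counter(const char *buf)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;
	return val;
}

/*
 * Read the calling process's ksm_merging_pages counter. "/proc/self" is
 * resolved at open() time, so this always refers to the caller.
 * Returns a negative errno value if the file is unavailable.
 */
long read_my_merging_pages(void)
{
	char buf[32];
	ssize_t n;
	int fd = open("/proc/self/ksm_merging_pages", O_RDONLY);

	if (fd < 0)
		return -errno;
	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	if (n <= 0)
		return -EIO;
	buf[n] = '\0';
	return parse_counter(buf);
}

/* Stop ksmd and unmerge all merged pages by writing "2" to the run knob. */
int unmerge_all(void)
{
	int fd = open("/sys/kernel/mm/ksm/run", O_WRONLY);
	int ret = 0;

	if (fd < 0)
		return -errno;
	if (write(fd, "2", 1) != 1)
		ret = errno ? -errno : -EIO;
	close(fd);
	return ret;
}
```

With helpers like these, a cleanup path could call unmerge_all() and then
check that read_my_merging_pages() returns 0 before the next test runs.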
On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
[...]
>> Child process inherit the ksm_merging_pages from parent, which is reasonable
>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>> for parent and leave ksm_merging_pages in child process unchanged.
>>
>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>> interface. I expect it applies to both parent and child.
>
>I am not very familiar with the KSM code, but from what I understand:
>
>The ksm_merging_pages counter is maintained per mm_struct. When
>we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>counters are updated for all mm_structs present in the ksm_mm_slot list.
>
>A mm_struct gets added to this list when MADV_MERGEABLE is called.
>In the case of the child process, since MADV_MERGEABLE has not been
>invoked yet, its mm_struct is not part of the list. As a result,
>its ksm_merging_pages counter is not reset.
>

Would this flag be inherited during fork? VM_MERGEABLE is saved in the
related vma, and I don't see where it would be dropped during fork. Maybe I
missed something.

>
>> > value remained unchanged. That’s why get_my_merging_page() in the child was
>> > returning a non-zero value.
>> >
>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>> couldn't reset the value, but a ksm_unmerge() in parent could.
>>
>> > Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>> > that
>> > resolved the problem. Later, I decided it would be cleaner to move the
>> > ksm_unmerge() call to the test cleanup phase.
>> >
>> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>>
>>    ksft_test_result(!range_maps_duplicates());
>>
>> If the previous tests succeed, it means there is no duplicate pages. This
>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>> proves it.)
>
>
>If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>which internally calls break_ksm() in the kernel. This function replaces the
>KSM page with an exclusive anonymous page. However, the
>ksm_merging_pages counters are not updated at this point.
>
>The function range_maps_duplicates(map, size) checks whether the pages
>have been unmerged. Since break_ksm() does perform the unmerge, this
>function returns false, and the test passes.
>
>The ksm_merging_pages update happens later via the ksm_scan_thread().
>That’s why we observe that ksm_merging_pages values are not reset
>immediately after the test finishes.
>

I am not familiar with KSM internals, but it feels odd to me that the
ksm_merging_pages counter still has a non-zero value after all merged pages
have been unmerged.

>If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>the ksm_merging_pages values are reset after the sleep.
>
>Once the test completes successfully, we can call ksm_unmerge(), which
>will immediately reset the ksm_merging_pages value. This way, in the fork
>test, the child process will also see the correct value.
>>
>> So which part of the story I missed?
>>
>
>So, during the cleanup phase after a successful test, we can call
>ksm_unmerge() to reset the counter. Do you see any issue with
>this approach?
>

It looks like there is no issue with an extra ksm_unmerge().

But one more question: why would an extra ksm_unmerge() help?

Here is what we have during the test:


  test_prot_none()
      !range_maps_duplicates()
      ksm_unmerge()                    1) <--- newly added
  test_prctl_fork()
      >--- in child
      __mmap_and_merge_range()
          ksm_unmerge()                2) <--- already have

As you mentioned above, ksm_unmerge() immediately resets
ksm_merging_pages. So why does the ksm_unmerge() at 2) still leave
ksm_merging_pages non-zero, while the one at 1) helps?

Or is there still some timing issue, like the sleep(1) you tried?

-- 
Wei Yang
Help you, Help me
On 8/6/25 8:24 PM, Wei Yang wrote: > On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote: > [...] >>> Child process inherit the ksm_merging_pages from parent, which is reasonable >>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages >>> for parent and leave ksm_merging_pages in child process unchanged. >>> >>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs >>> interface. I expect it applies to both parent and child. >> I am not very familiar with the KSM code, but from what I understand: >> >> The ksm_merging_pages counter is maintained per mm_struct. When >> we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the >> counters are updated for all mm_structs present in the ksm_mm_slot list. >> >> A mm_struct gets added to this list when MADV_MERGEABLE is called. >> In the case of the child process, since MADV_MERGEABLE has not been >> invoked yet, its mm_struct is not part of the list. As a result, >> its ksm_merging_pages counter is not reset. >> > Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma > I don't see it would be dropped during fork. Maybe missed. > >>>> value remained unchanged. That’s why get_my_merging_page() in the child was >>>> returning a non-zero value. >>>> >>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return >>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge() >>> couldn't reset the value, but a ksm_unmerge() in parent could. >>> >>>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and >>>> that >>>> resolved the problem. Later, I decided it would be cleaner to move the >>>> ksm_unmerge() call to the test cleanup phase. >>>> >>> Also all the tests before test_prctl_fork(), except test_prctl(), calls >>> >>> ksft_test_result(!range_maps_duplicates()); >>> >>> If the previous tests succeed, it means there is no duplicate pages. 
This
>>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>>> proves it.)
>>
>> If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>> which internally calls break_ksm() in the kernel. This function replaces the
>> KSM page with an exclusive anonymous page. However, the
>> ksm_merging_pages counters are not updated at this point.
>>
>> The function range_maps_duplicates(map, size) checks whether the pages
>> have been unmerged. Since break_ksm() does perform the unmerge, this
>> function returns false, and the test passes.
>>
>> The ksm_merging_pages update happens later via the ksm_scan_thread().
>> That’s why we observe that ksm_merging_pages values are not reset
>> immediately after the test finishes.
>>
> Not familiar with ksm internal. But the ksm_merging_pages counter still has
> non-zero value when all merged pages are unmerged makes me feel odd.
>
>> If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>> the ksm_merging_pages values are reset after the sleep.
>>
>> Once the test completes successfully, we can call ksm_unmerge(), which
>> will immediately reset the ksm_merging_pages value. This way, in the fork
>> test, the child process will also see the correct value.
>>> So which part of the story I missed?
>>>
>> So, during the cleanup phase after a successful test, we can call
>> ksm_unmerge() to reset the counter. Do you see any issue with
>> this approach?
>>
> It looks there is no issue with an extra ksm_unmerge().
>
> But one more question. Why an extra ksm_unmerge() could help.
>
> Here is what we have during test:
>
>
> test_prot_none()
>     !range_maps_duplicates()
>     ksm_unmerge()                1) <--- newly add
> test_prctl_fork()
>     >--- in child
>     __mmap_and_merge_range()
>         ksm_unmerge()            2) <--- already have
>
> As you mentioned above ksm_unmerge() would immediately reset
> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
> non-zero? And the one at 1) could help.

From the debugging, what I understood is:

When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
mm_struct of the process gets added to the ksm_mm_slot list. As a
result, both the parent and child processes’ mm_struct structures
will be present in ksm_mm_slot.

When KSM merges the pages, it creates a ksm_rmap_item for each page,
and the ksm_merging_pages counter is incremented accordingly.

Since the parent process did the merge, its mm_struct is present in
ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
pages.

When a process is forked, the child’s mm_struct is also added to
ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
no ksm_rmap_item entries are created for the child process because it
did not do any merge.

When ksm_unmerge() is called, it iterates over all processes in
ksm_mm_slot. In our case, both the parent and child are present. It
first processes the parent, which has ksm_rmap_item entries, so it
unmerges the pages and resets the ksm_merging_pages counter.

For the child, since it did not perform any actual merging, it does not
have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
and the counter remains unchanged.

So, only processes that performed KSM merging will have their counters
updated during ksm_unmerge(). The child process, having not initiated any
merging, retains the inherited counter value without any update.

So from a testing point of view, I think it is better to reset the
counters as part of the cleanup code to ensure that the next tests do
not get incorrect values.

The question I have is: is it correct to keep the inherited
ksm_merging_pages value in the child, or should we reset it to 0 during
ksm_fork()?

>
> Or there is still some timing issue like sleep(1) you did?
>
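The walk described in this message can be mimicked with a toy model. The sketch below is not kernel code: struct toy_mm, toy_merge(), toy_fork() and toy_unmerge_all() are hypothetical names, and the model assumes only what the explanation above states, namely that fork copies the counter without creating rmap items and that unmerging lowers the counter once per existing rmap item.

```c
#include <stddef.h>

/* Toy stand-in for the per-process state described above (NOT mm_struct). */
struct toy_mm {
	long merging_pages;	/* models the per-process ksm_merging_pages */
	long rmap_items;	/* models how many ksm_rmap_item entries exist */
};

/* Model of a merge: every merged page gets an rmap item and bumps the counter. */
void toy_merge(struct toy_mm *mm, long pages)
{
	mm->rmap_items += pages;
	mm->merging_pages += pages;
}

/* Model of fork as described in the thread: counter copied, no rmap items. */
struct toy_mm toy_fork(const struct toy_mm *parent)
{
	struct toy_mm child = {
		.merging_pages = parent->merging_pages,
		.rmap_items = 0,
	};

	return child;
}

/* Model of the unmerge walk: only existing rmap items lower the counter. */
void toy_unmerge_all(struct toy_mm *mms[], size_t n)
{
	for (size_t i = 0; i < n; i++) {
		mms[i]->merging_pages -= mms[i]->rmap_items;
		mms[i]->rmap_items = 0;
	}
}
```

With a parent that merged four pages, a fork, and one unmerge pass over both entries, the parent's modelled counter drops to 0 while the child's stays at 4, which matches the stale value reported for the child in test_prctl_fork().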
On Thu, Aug 07, 2025 at 02:56:28PM +0530, Donet Tom wrote:
>
>On 8/6/25 8:24 PM, Wei Yang wrote:
>> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
>> [...]
>> > > Child process inherit the ksm_merging_pages from parent, which is reasonable
>> > > to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>> > > for parent and leave ksm_merging_pages in child process unchanged.
>> > >
>> > > ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>> > > interface. I expect it applies to both parent and child.
>> > I am not very familiar with the KSM code, but from what I understand:
>> >
>> > The ksm_merging_pages counter is maintained per mm_struct. When
>> > we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>> > counters are updated for all mm_structs present in the ksm_mm_slot list.
>> >
>> > A mm_struct gets added to this list when MADV_MERGEABLE is called.
>> > In the case of the child process, since MADV_MERGEABLE has not been
>> > invoked yet, its mm_struct is not part of the list. As a result,
>> > its ksm_merging_pages counter is not reset.
>> >
>> Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma
>> I don't see it would be dropped during fork. Maybe missed.
>>
>> > > > value remained unchanged. That’s why get_my_merging_page() in the child was
>> > > > returning a non-zero value.
>> > > >
>> > > I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>> > > a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>> > > couldn't reset the value, but a ksm_unmerge() in parent could.
>> > >
>> > > > Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and that
>> > > > resolved the problem. Later, I decided it would be cleaner to move the
>> > > > ksm_unmerge() call to the test cleanup phase.
>> > > >
>> > > Also all the tests before test_prctl_fork(), except test_prctl(), calls
>> > >
>> > >     ksft_test_result(!range_maps_duplicates());
>> > >
>> > > If the previous tests succeed, it means there is no duplicate pages. This
>> > > means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>> > > pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>> > > proves it.)
>> >
>> > If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>> > which internally calls break_ksm() in the kernel. This function replaces the
>> > KSM page with an exclusive anonymous page. However, the
>> > ksm_merging_pages counters are not updated at this point.
>> >
>> > The function range_maps_duplicates(map, size) checks whether the pages
>> > have been unmerged. Since break_ksm() does perform the unmerge, this
>> > function returns false, and the test passes.
>> >
>> > The ksm_merging_pages update happens later via the ksm_scan_thread().
>> > That’s why we observe that ksm_merging_pages values are not reset
>> > immediately after the test finishes.
>> >
>> Not familiar with ksm internal. But the ksm_merging_pages counter still has
>> non-zero value when all merged pages are unmerged makes me feel odd.
>>
>> > If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>> > the ksm_merging_pages values are reset after the sleep.
>> >
>> > Once the test completes successfully, we can call ksm_unmerge(), which
>> > will immediately reset the ksm_merging_pages value. This way, in the fork
>> > test, the child process will also see the correct value.
>> > > So which part of the story I missed?
>> > >
>> > So, during the cleanup phase after a successful test, we can call
>> > ksm_unmerge() to reset the counter. Do you see any issue with
>> > this approach?
>> >
>> It looks there is no issue with an extra ksm_unmerge().
>>
>> But one more question. Why an extra ksm_unmerge() could help.
>>
>> Here is what we have during test:
>>
>>
>> test_prot_none()
>>     !range_maps_duplicates()
>>     ksm_unmerge()                1) <--- newly add
>> test_prctl_fork()
>>     >--- in child
>>     __mmap_and_merge_range()
>>         ksm_unmerge()            2) <--- already have
>>
>> As you mentioned above ksm_unmerge() would immediately reset
>> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
>> non-zero? And the one at 1) could help.
>
>
>From the debugging, what I understood is:
>
>When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
>mm_struct of the process gets added to the ksm_mm_slot list. As a
>result, both the parent and child processes’ mm_struct structures
>will be present in ksm_mm_slot.
>
>When KSM merges the pages, it creates a ksm_rmap_item for each page,
>and the ksm_merging_pages counter is incremented accordingly.
>
>Since the parent process did the merge, its mm_struct is present in
>ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
>pages.
>
>When a process is forked, the child’s mm_struct is also added to
>ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
>no ksm_rmap_item entries are created for the child process because it
>did not do any merge.
>
>When ksm_unmerge() is called, it iterates over all processes in
>ksm_mm_slot. In our case, both the parent and child are present. It
>first processes the parent, which has ksm_rmap_item entries, so it
>unmerges the pages and resets the ksm_merging_pages counter.
>
>For the child, since it did not perform any actual merging, it does not
>have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
>and the counter remains unchanged.
>

Thanks for the detailed analysis.

So the key is child has no ksm_rmap_item which will not clear ksm_merging_page
on ksm_unmerge().

>So, only processes that performed KSM merging will have their counters
>updated during ksm_unmerge(). The child process, having not initiated any
>merging, retains the inherited counter value without any update.
>
>So from a testing point of view, I think it is better to reset the
>counters as part of the cleanup code to ensure that the next tests do
>not get incorrect values.
>

Hmm... I agree from the test point of view based on current situation.

While maybe this is also a check point for later version.

>The question I have is: is it correct to keep the inherited
>|ksm_merging_page|
>value in the child or Should we reset it to 0 during |ksm_fork()|?
>

Very good question. There looks to be something wrong, but I am not sure this
is the correct way.

>>
>> Or there is still some timing issue like sleep(1) you did?
>>

--
Wei Yang
Help you, Help me
On 8/8/25 8:28 AM, Wei Yang wrote:
> On Thu, Aug 07, 2025 at 02:56:28PM +0530, Donet Tom wrote:
>> On 8/6/25 8:24 PM, Wei Yang wrote:
>>> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
>>> [...]
>>>>> Child process inherit the ksm_merging_pages from parent, which is reasonable
>>>>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>>>>> for parent and leave ksm_merging_pages in child process unchanged.
>>>>>
>>>>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>>>>> interface. I expect it applies to both parent and child.
>>>> I am not very familiar with the KSM code, but from what I understand:
>>>>
>>>> The ksm_merging_pages counter is maintained per mm_struct. When
>>>> we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>>>> counters are updated for all mm_structs present in the ksm_mm_slot list.
>>>>
>>>> A mm_struct gets added to this list when MADV_MERGEABLE is called.
>>>> In the case of the child process, since MADV_MERGEABLE has not been
>>>> invoked yet, its mm_struct is not part of the list. As a result,
>>>> its ksm_merging_pages counter is not reset.
>>>>
>>> Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma
>>> I don't see it would be dropped during fork. Maybe missed.
>>>
>>>>>> value remained unchanged. That’s why get_my_merging_page() in the child was
>>>>>> returning a non-zero value.
>>>>>>
>>>>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>>>>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>>>>> couldn't reset the value, but a ksm_unmerge() in parent could.
>>>>>
>>>>>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and that
>>>>>> resolved the problem. Later, I decided it would be cleaner to move the
>>>>>> ksm_unmerge() call to the test cleanup phase.
>>>>>>
>>>>> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>>>>>
>>>>>     ksft_test_result(!range_maps_duplicates());
>>>>>
>>>>> If the previous tests succeed, it means there is no duplicate pages. This
>>>>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>>>>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>>>>> proves it.)
>>>> If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>>>> which internally calls break_ksm() in the kernel. This function replaces the
>>>> KSM page with an exclusive anonymous page. However, the
>>>> ksm_merging_pages counters are not updated at this point.
>>>>
>>>> The function range_maps_duplicates(map, size) checks whether the pages
>>>> have been unmerged. Since break_ksm() does perform the unmerge, this
>>>> function returns false, and the test passes.
>>>>
>>>> The ksm_merging_pages update happens later via the ksm_scan_thread().
>>>> That’s why we observe that ksm_merging_pages values are not reset
>>>> immediately after the test finishes.
>>>>
>>> Not familiar with ksm internal. But the ksm_merging_pages counter still has
>>> non-zero value when all merged pages are unmerged makes me feel odd.
>>>
>>>> If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>>>> the ksm_merging_pages values are reset after the sleep.
>>>>
>>>> Once the test completes successfully, we can call ksm_unmerge(), which
>>>> will immediately reset the ksm_merging_pages value. This way, in the fork
>>>> test, the child process will also see the correct value.
>>>>> So which part of the story I missed?
>>>>>
>>>> So, during the cleanup phase after a successful test, we can call
>>>> ksm_unmerge() to reset the counter. Do you see any issue with
>>>> this approach?
>>>>
>>> It looks there is no issue with an extra ksm_unmerge().
>>>
>>> But one more question. Why an extra ksm_unmerge() could help.
>>>
>>> Here is what we have during test:
>>>
>>>
>>> test_prot_none()
>>>     !range_maps_duplicates()
>>>     ksm_unmerge()                1) <--- newly add
>>> test_prctl_fork()
>>>     >--- in child
>>>     __mmap_and_merge_range()
>>>         ksm_unmerge()            2) <--- already have
>>>
>>> As you mentioned above ksm_unmerge() would immediately reset
>>> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
>>> non-zero? And the one at 1) could help.
>>
>> From the debugging, what I understood is:
>> When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
>> mm_struct of the process gets added to the ksm_mm_slot list. As a
>> result, both the parent and child processes’ mm_struct structures
>> will be present in ksm_mm_slot.
>>
>> When KSM merges the pages, it creates a ksm_rmap_item for each page,
>> and the ksm_merging_pages counter is incremented accordingly.
>>
>> Since the parent process did the merge, its mm_struct is present in
>> ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
>> pages.
>>
>> When a process is forked, the child’s mm_struct is also added to
>> ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
>> no ksm_rmap_item entries are created for the child process because it
>> did not do any merge.
>>
>> When ksm_unmerge() is called, it iterates over all processes in
>> ksm_mm_slot. In our case, both the parent and child are present. It
>> first processes the parent, which has ksm_rmap_item entries, so it
>> unmerges the pages and resets the ksm_merging_pages counter.
>>
>> For the child, since it did not perform any actual merging, it does not
>> have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
>> and the counter remains unchanged.
>>
> Thanks for the detailed analysis.
>
> So the key is child has no ksm_rmap_item which will not clear ksm_merging_page
> on ksm_unmerge().
>
>> So, only processes that performed KSM merging will have their counters
>> updated during ksm_unmerge(). The child process, having not initiated any
>> merging, retains the inherited counter value without any update.
>>
>> So from a testing point of view, I think it is better to reset the
>> counters as part of the cleanup code to ensure that the next tests do
>> not get incorrect values.
>>
> Hmm... I agree from the test point of view based on current situation.
>
> While maybe this is also a check point for later version.

Are you okay to proceed with the current patch in this series?

>
>> The question I have is: is it correct to keep the inherited
>> |ksm_merging_page|
>> value in the child or Should we reset it to 0 during |ksm_fork()|?
>>
> Very good question. There looks to be something wrong, but I am not sure this
> is the correct way.

ok. I am going through it and will come up with a fix along with a test
for this scenario. I will post it as a separate series.

>
>>> Or there is still some timing issue like sleep(1) you did?
>>>
On Fri, Aug 08, 2025 at 07:55:37PM +0530, Donet Tom wrote:
[...]
>> Thanks for the detailed analysis.
>>
>> So the key is child has no ksm_rmap_item which will not clear ksm_merging_page
>> on ksm_unmerge().
>>
>> > So, only processes that performed KSM merging will have their counters
>> > updated during ksm_unmerge(). The child process, having not initiated any
>> > merging, retains the inherited counter value without any update.
>> >
>> > So from a testing point of view, I think it is better to reset the
>> > counters as part of the cleanup code to ensure that the next tests do
>> > not get incorrect values.
>> >
>> Hmm... I agree from the test point of view based on current situation.
>>
>> While maybe this is also a check point for later version.
>
>Are you okay to proceed with the current patch in this series?
>

Sure.

--
Wei Yang
Help you, Help me
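On the test side, the cleanup agreed on in this thread comes down to two small operations: writing "2" to the KSM run file and re-reading the per-process counter. A minimal sketch follows, assuming hypothetical helpers read_counter() and request_unmerge() that take the path as a parameter (in the selftest the paths would be /proc/self/ksm_merging_pages and /sys/kernel/mm/ksm/run). The selftest itself keeps global file descriptors rather than reopening the files, so this is an illustration and not the patch's code.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read a decimal counter from a file such as /proc/self/ksm_merging_pages.
 * The path is opened on every call, so a forked child resolves /proc/self
 * to itself instead of reusing a descriptor opened by its parent.
 * Returns the value, or -errno on failure.
 */
long read_counter(const char *path)
{
	char buf[32];
	ssize_t n;
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;
	n = read(fd, buf, sizeof(buf) - 1);
	err = errno;
	close(fd);
	if (n < 0)
		return -err;
	buf[n] = '\0';
	return strtol(buf, NULL, 10);
}

/*
 * Ask KSM to unmerge by writing "2" to the run file (normally
 * /sys/kernel/mm/ksm/run, which needs privileges to write).
 * Returns 0 on success, or -errno on failure.
 */
int request_unmerge(const char *run_path)
{
	int fd, err = 0;

	fd = open(run_path, O_WRONLY);
	if (fd < 0)
		return -errno;
	errno = 0;
	if (write(fd, "2", 1) != 1)
		err = errno ? errno : EIO;
	close(fd);
	return -err;
}
```

A test's cleanup path could call request_unmerge() once its checks have passed and then confirm that read_counter() reports 0 before the next test starts. Whether a forked child's inherited counter should also be cleared in the kernel is the open question left for the separate series mentioned above.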