[PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures

Aboorva Devarajan posted 7 patches 2 months, 1 week ago
There is a newer version of this series
Posted by Aboorva Devarajan 2 months, 1 week ago
From: Donet Tom <donettom@linux.ibm.com>

This patch fixes two issues.

1) After fork() in test_prctl_fork, the child process uses the file
descriptors inherited from the parent process to read ksm_stat and
ksm_merging_pages. As a result, the child reads the parent's ksm_stat
and ksm_merging_pages values instead of its own, causing the test to
fail.

This patch calls init_global_file_handles() in the child process to
ensure that the current process's file descriptors are used to read
ksm_stat and ksm_merging_pages.

2) All tests currently call ksm_merge to trigger page merging.
To ensure the system remains in a consistent state for subsequent
tests, it is better to call ksm_unmerge during the test cleanup phase.

In the test_prctl_fork test, after a fork(), reading ksm_merging_pages
in the child process returns a non-zero value because a previous test
performed a merge, and the child's memory state is inherited from the
parent.

Although the child process calls ksm_unmerge, only the parent's
ksm_merging_pages counter is reset to zero; the child's counter
remains unchanged. This discrepancy causes the test to fail.

To avoid this issue, each test should call ksm_unmerge during cleanup
to ensure the counter is reset and the system is in a clean state for
subsequent tests.

The execv() argument is an array of pointers to null-terminated strings,
and the array itself must be terminated by a NULL pointer. This patch
also adds the missing NULL to the execv() argument array.

Fixes: 6c47de3be3a0 ("selftest/mm: ksm_functional_tests: extend test case for ksm fork/exec")
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
 tools/testing/selftests/mm/ksm_functional_tests.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
index d8bd1911dfc0..996dc6645570 100644
--- a/tools/testing/selftests/mm/ksm_functional_tests.c
+++ b/tools/testing/selftests/mm/ksm_functional_tests.c
@@ -46,6 +46,8 @@ static int ksm_use_zero_pages_fd;
 static int pagemap_fd;
 static size_t pagesize;
 
+static void init_global_file_handles(void);
+
 static bool range_maps_duplicates(char *addr, unsigned long size)
 {
 	unsigned long offs_a, offs_b, pfn_a, pfn_b;
@@ -274,6 +276,7 @@ static void test_unmerge(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 
@@ -338,6 +341,7 @@ static void test_unmerge_zero_pages(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			"KSM zero pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 
@@ -366,6 +370,7 @@ static void test_unmerge_discarded(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 
@@ -452,6 +457,7 @@ static void test_unmerge_uffd_wp(void)
 close_uffd:
 	close(uffd);
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 #endif
@@ -515,6 +521,7 @@ static int test_child_ksm(void)
 	else if (map == MAP_MERGE_SKIP)
 		return -3;
 
+	ksm_unmerge();
 	munmap(map, size);
 	return 0;
 }
@@ -548,6 +555,7 @@ static void test_prctl_fork(void)
 
 	child_pid = fork();
 	if (!child_pid) {
+		init_global_file_handles();
 		exit(test_child_ksm());
 	} else if (child_pid < 0) {
 		ksft_test_result_fail("fork() failed\n");
@@ -595,7 +603,7 @@ static void test_prctl_fork_exec(void)
 		return;
 	} else if (child_pid == 0) {
 		char *prg_name = "./ksm_functional_tests";
-		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME };
+		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME, NULL };
 
 		execv(prg_name, argv_for_program);
 		return;
@@ -644,6 +652,7 @@ static void test_prctl_unmerge(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 
@@ -677,6 +686,7 @@ static void test_prot_none(void)
 	ksft_test_result(!range_maps_duplicates(map, size),
 			 "Pages were unmerged\n");
 unmap:
+	ksm_unmerge();
 	munmap(map, size);
 }
 
-- 
2.47.1
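
For readers following the execv() hunk above, here is a minimal sketch of why the trailing NULL matters. It is not taken from the selftest: the helper name run_shell_exit() and the use of /bin/sh are assumptions made for illustration. execv() has no length parameter, so it walks the argv array until it reaches a NULL pointer; without the terminator it reads past the end of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of ksm_functional_tests.c): fork, exec
 * /bin/sh with a NULL-terminated argv, and return the child's exit status,
 * or -1 on error.
 */
static int run_shell_exit(const char *script)
{
	/* execv() scans argv until it finds NULL, so the terminator is required. */
	char *argv_for_sh[] = { "sh", "-c", (char *)script, NULL };
	int status;
	pid_t pid;

	assert(script != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execv("/bin/sh", argv_for_sh);
		_exit(127);	/* only reached if execv() itself failed */
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling run_shell_exit("exit 7") from main() should return the shell's exit status, provided /bin/sh exists on the system.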
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Wei Yang 2 months ago
On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>From: Donet Tom <donettom@linux.ibm.com>
>
>This patch fixed 2 issues.
>
>1) After fork() in test_prctl_fork, the child process uses the file
>descriptors from the parent process to read ksm_stat and
>ksm_merging_pages. This results in incorrect values being read (parent
>process ksm_stat and ksm_merging_pages will be read in child), causing
>the test to fail.
>
>This patch calls init_global_file_handles() in the child process to
>ensure that the current process's file descriptors are used to read
>ksm_stat and ksm_merging_pages.
>
>2) All tests currently call ksm_merge to trigger page merging.
>To ensure the system remains in a consistent state for subsequent
>tests, it is better to call ksm_unmerge during the test cleanup phase.
>
>In the test_prctl_fork test, after a fork(), reading ksm_merging_pages
>in the child process returns a non-zero value because a previous test
>performed a merge, and the child's memory state is inherited from the
>parent.
>
>Although the child process calls ksm_unmerge, the ksm_merging_pages
>counter in the parent is reset to zero, while the child's counter
>remains unchanged. This discrepancy causes the test to fail.
>
>To avoid this issue, each test should call ksm_unmerge during cleanup
>to ensure the counter is reset and the system is in a clean state for
>subsequent tests.
>
>execv argument is an array of pointers to null-terminated strings.
>In this patch we also added NULL in the execv argument.
>
>Fixes: 6c47de3be3a0 ("selftest/mm: ksm_functional_tests: extend test case for ksm fork/exec")
>Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
>Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
>Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>---
> tools/testing/selftests/mm/ksm_functional_tests.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
>diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
>index d8bd1911dfc0..996dc6645570 100644
>--- a/tools/testing/selftests/mm/ksm_functional_tests.c
>+++ b/tools/testing/selftests/mm/ksm_functional_tests.c
>@@ -46,6 +46,8 @@ static int ksm_use_zero_pages_fd;
> static int pagemap_fd;
> static size_t pagesize;
> 
>+static void init_global_file_handles(void);
>+
> static bool range_maps_duplicates(char *addr, unsigned long size)
> {
> 	unsigned long offs_a, offs_b, pfn_a, pfn_b;
>@@ -274,6 +276,7 @@ static void test_unmerge(void)
> 	ksft_test_result(!range_maps_duplicates(map, size),
> 			 "Pages were unmerged\n");
> unmap:
>+	ksm_unmerge();

In __mmap_and_merge_range(), we already call ksm_unmerge(). Why does that one
not help?

I am not very familiar with the KSM code. Would you mind explaining in more
detail how this fixes the failure you see?

> 	munmap(map, size);
> }
> 
>@@ -338,6 +341,7 @@ static void test_unmerge_zero_pages(void)
> 	ksft_test_result(!range_maps_duplicates(map, size),
> 			"KSM zero pages were unmerged\n");
> unmap:
>+	ksm_unmerge();
> 	munmap(map, size);
> }
> 
>@@ -366,6 +370,7 @@ static void test_unmerge_discarded(void)
> 	ksft_test_result(!range_maps_duplicates(map, size),
> 			 "Pages were unmerged\n");
> unmap:
>+	ksm_unmerge();
> 	munmap(map, size);
> }
> 
>@@ -452,6 +457,7 @@ static void test_unmerge_uffd_wp(void)
> close_uffd:
> 	close(uffd);
> unmap:
>+	ksm_unmerge();
> 	munmap(map, size);
> }
> #endif
>@@ -515,6 +521,7 @@ static int test_child_ksm(void)
> 	else if (map == MAP_MERGE_SKIP)
> 		return -3;
> 
>+	ksm_unmerge();
> 	munmap(map, size);
> 	return 0;
> }
>@@ -548,6 +555,7 @@ static void test_prctl_fork(void)
> 
> 	child_pid = fork();
> 	if (!child_pid) {
>+		init_global_file_handles();

Would this leave the fd in the parent as an orphan?

> 		exit(test_child_ksm());
> 	} else if (child_pid < 0) {
> 		ksft_test_result_fail("fork() failed\n");
>@@ -595,7 +603,7 @@ static void test_prctl_fork_exec(void)
> 		return;
> 	} else if (child_pid == 0) {
> 		char *prg_name = "./ksm_functional_tests";
>-		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME };
>+		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME, NULL };
> 
> 		execv(prg_name, argv_for_program);
> 		return;
>@@ -644,6 +652,7 @@ static void test_prctl_unmerge(void)
> 	ksft_test_result(!range_maps_duplicates(map, size),
> 			 "Pages were unmerged\n");
> unmap:
>+	ksm_unmerge();
> 	munmap(map, size);
> }
> 
>@@ -677,6 +686,7 @@ static void test_prot_none(void)
> 	ksft_test_result(!range_maps_duplicates(map, size),
> 			 "Pages were unmerged\n");
> unmap:
>+	ksm_unmerge();
> 	munmap(map, size);
> }
> 
>-- 
>2.47.1
>

-- 
Wei Yang
Help you, Help me
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by David Hildenbrand 2 months ago
>> }
>>
>> @@ -338,6 +341,7 @@ static void test_unmerge_zero_pages(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			"KSM zero pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>>
>> @@ -366,6 +370,7 @@ static void test_unmerge_discarded(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			 "Pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>>
>> @@ -452,6 +457,7 @@ static void test_unmerge_uffd_wp(void)
>> close_uffd:
>> 	close(uffd);
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>> #endif
>> @@ -515,6 +521,7 @@ static int test_child_ksm(void)
>> 	else if (map == MAP_MERGE_SKIP)
>> 		return -3;
>>
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> 	return 0;
>> }
>> @@ -548,6 +555,7 @@ static void test_prctl_fork(void)
>>
>> 	child_pid = fork();
>> 	if (!child_pid) {
>> +		init_global_file_handles();
> 
> Would this leave fd in parent as orphan?

Probably yes, but only until the child quits, so likely we don't care.

-- 
Cheers,

David / dhildenb
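
The file-descriptor point in this sub-thread can be illustrated with a small sketch. It assumes Linux procfs and uses /proc/self/stat as a stand-in for /proc/self/ksm_stat so that it does not depend on KSM being enabled; the helper names stat_fd_pid() and check_proc_self_after_fork() are hypothetical and not part of the selftest. /proc/self is resolved when open() is called, so a descriptor opened before fork() keeps referring to the parent's entry, while a descriptor opened in the child refers to the child's.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the leading PID field of a /proc/<pid>/stat file through fd. */
static long stat_fd_pid(int fd)
{
	char buf[64];
	ssize_t n = pread(fd, buf, sizeof(buf) - 1, 0);

	if (n <= 0)
		return -1;
	buf[n] = '\0';
	return strtol(buf, NULL, 10);
}

/*
 * Returns 0 if, in a forked child, the inherited descriptor still reports
 * the parent's PID while a freshly opened one reports the child's PID.
 * Returns 1 if that does not hold, and -1 on setup errors.
 */
static int check_proc_self_after_fork(void)
{
	int status;
	int fd = open("/proc/self/stat", O_RDONLY);
	pid_t pid;

	if (fd < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fd);
		return -1;
	}
	if (pid == 0) {
		/* Reopen in the child, as init_global_file_handles() would. */
		int fresh = open("/proc/self/stat", O_RDONLY);
		int ok = fresh >= 0 &&
			 stat_fd_pid(fd) == (long)getppid() &&
			 stat_fd_pid(fresh) == (long)getpid();

		/* The inherited fd is never closed; it goes away at exit. */
		_exit(ok ? 0 : 1);
	}
	close(fd);
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

In this sketch the child never closes the inherited descriptor, which mirrors the situation discussed above: the old descriptors stay open in the child only until it exits.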
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Donet Tom 2 months ago
On 8/4/25 2:41 PM, Wei Yang wrote:
> On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>> From: Donet Tom <donettom@linux.ibm.com>
>>
>> This patch fixed 2 issues.
>>
>> 1) After fork() in test_prctl_fork, the child process uses the file
>> descriptors from the parent process to read ksm_stat and
>> ksm_merging_pages. This results in incorrect values being read (parent
>> process ksm_stat and ksm_merging_pages will be read in child), causing
>> the test to fail.
>>
>> This patch calls init_global_file_handles() in the child process to
>> ensure that the current process's file descriptors are used to read
>> ksm_stat and ksm_merging_pages.
>>
>> 2) All tests currently call ksm_merge to trigger page merging.
>> To ensure the system remains in a consistent state for subsequent
>> tests, it is better to call ksm_unmerge during the test cleanup phase.
>>
>> In the test_prctl_fork test, after a fork(), reading ksm_merging_pages
>> in the child process returns a non-zero value because a previous test
>> performed a merge, and the child's memory state is inherited from the
>> parent.
>>
>> Although the child process calls ksm_unmerge, the ksm_merging_pages
>> counter in the parent is reset to zero, while the child's counter
>> remains unchanged. This discrepancy causes the test to fail.
>>
>> To avoid this issue, each test should call ksm_unmerge during cleanup
>> to ensure the counter is reset and the system is in a clean state for
>> subsequent tests.
>>
>> execv argument is an array of pointers to null-terminated strings.
>> In this patch we also added NULL in the execv argument.
>>
>> Fixes: 6c47de3be3a0 ("selftest/mm: ksm_functional_tests: extend test case for ksm fork/exec")
>> Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
>> Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
>> Signed-off-by: Donet Tom <donettom@linux.ibm.com>
>> ---
>> tools/testing/selftests/mm/ksm_functional_tests.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
>> index d8bd1911dfc0..996dc6645570 100644
>> --- a/tools/testing/selftests/mm/ksm_functional_tests.c
>> +++ b/tools/testing/selftests/mm/ksm_functional_tests.c
>> @@ -46,6 +46,8 @@ static int ksm_use_zero_pages_fd;
>> static int pagemap_fd;
>> static size_t pagesize;
>>
>> +static void init_global_file_handles(void);
>> +
>> static bool range_maps_duplicates(char *addr, unsigned long size)
>> {
>> 	unsigned long offs_a, offs_b, pfn_a, pfn_b;
>> @@ -274,6 +276,7 @@ static void test_unmerge(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			 "Pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
> In __mmap_and_merge_range(), we call ksm_unmerge(). Why this one not help?
>
> Not very familiar with ksm stuff. Would you mind giving more on how this fix
> the failure you see?


The issue I was facing here was that test_prctl_fork was failing.

# [RUN] test_prctl_fork
# Still pages merged
#

This issue occurred because the previous test performed a merge, causing
the value of /proc/self/ksm_merging_pages to reflect the number of
deduplicated pages. After that, a fork() was called. Post-fork, the
child process inherited the parent's ksm_merging_pages value.

Then, the child process invoked __mmap_and_merge_range(), which resulted
in unmerging the pages and resetting the value. However, since the
parent process had performed the merge, its ksm_merging_pages value also
got reset to 0. Meanwhile, the child process had not performed any merge
itself, so the inherited value remained unchanged. That’s why
get_my_merging_page() in the child was returning a non-zero value.

Initially, I fixed the issue by calling ksm_unmerge() before the fork(),
and that resolved the problem. Later, I decided it would be cleaner to
move the ksm_unmerge() call to the test cleanup phase.


>
>> 	munmap(map, size);
>> }
>>
>> @@ -338,6 +341,7 @@ static void test_unmerge_zero_pages(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			"KSM zero pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>>
>> @@ -366,6 +370,7 @@ static void test_unmerge_discarded(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			 "Pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>>
>> @@ -452,6 +457,7 @@ static void test_unmerge_uffd_wp(void)
>> close_uffd:
>> 	close(uffd);
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>> #endif
>> @@ -515,6 +521,7 @@ static int test_child_ksm(void)
>> 	else if (map == MAP_MERGE_SKIP)
>> 		return -3;
>>
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> 	return 0;
>> }
>> @@ -548,6 +555,7 @@ static void test_prctl_fork(void)
>>
>> 	child_pid = fork();
>> 	if (!child_pid) {
>> +		init_global_file_handles();
> Would this leave fd in parent as orphan?
>
>> 		exit(test_child_ksm());
>> 	} else if (child_pid < 0) {
>> 		ksft_test_result_fail("fork() failed\n");
>> @@ -595,7 +603,7 @@ static void test_prctl_fork_exec(void)
>> 		return;
>> 	} else if (child_pid == 0) {
>> 		char *prg_name = "./ksm_functional_tests";
>> -		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME };
>> +		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME, NULL };
>>
>> 		execv(prg_name, argv_for_program);
>> 		return;
>> @@ -644,6 +652,7 @@ static void test_prctl_unmerge(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			 "Pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>>
>> @@ -677,6 +686,7 @@ static void test_prot_none(void)
>> 	ksft_test_result(!range_maps_duplicates(map, size),
>> 			 "Pages were unmerged\n");
>> unmap:
>> +	ksm_unmerge();
>> 	munmap(map, size);
>> }
>>
>> -- 
>> 2.47.1
>>
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Wei Yang 2 months ago
On Tue, Aug 05, 2025 at 11:39:15AM +0530, Donet Tom wrote:
>
>On 8/4/25 2:41 PM, Wei Yang wrote:
>> On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>> > From: Donet Tom <donettom@linux.ibm.com>
>> > 
[...]
>> > diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
>> > index d8bd1911dfc0..996dc6645570 100644
>> > --- a/tools/testing/selftests/mm/ksm_functional_tests.c
>> > +++ b/tools/testing/selftests/mm/ksm_functional_tests.c
>> > @@ -46,6 +46,8 @@ static int ksm_use_zero_pages_fd;
>> > static int pagemap_fd;
>> > static size_t pagesize;
>> > 
>> > +static void init_global_file_handles(void);
>> > +
>> > static bool range_maps_duplicates(char *addr, unsigned long size)
>> > {
>> > 	unsigned long offs_a, offs_b, pfn_a, pfn_b;
>> > @@ -274,6 +276,7 @@ static void test_unmerge(void)
>> > 	ksft_test_result(!range_maps_duplicates(map, size),
>> > 			 "Pages were unmerged\n");
>> > unmap:
>> > +	ksm_unmerge();
>> In __mmap_and_merge_range(), we call ksm_unmerge(). Why this one not help?
>> 
>> Not very familiar with ksm stuff. Would you mind giving more on how this fix
>> the failure you see?
>
>
>The issue I was facing here was test_prctl_fork was failing.
>
># [RUN] test_prctl_fork
># Still pages merged
>#
>
>This issue occurred because the previous test performed a merge, causing
>the value of /proc/self/ksm_merging_pages to reflect the number of
>deduplicated pages. After that, a fork() was called. Post-fork, the child
>process
>inherited the parent's ksm_merging_pages value.
>

Yes, this one is fixed by calling init_global_file_handles() in child.

>Then, the child process invoked __mmap_and_merge_range(), which resulted
>in unmerging the pages and resetting the value. However, since the parent
>process
>had performed the merge, its ksm_merging_pages value also got reset to 0.
>Meanwhile, the child process had not performed any merge itself, so the
>inherited

I assume the behavior described here is after the change to call
init_global_file_handles() in child.

The child process inheriting ksm_merging_pages from the parent seems
reasonable to me. But I am confused about why ksm_unmerge() would reset
ksm_merging_pages only for the parent and leave it unchanged in the child
process.

ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system-wide sysfs
interface. I would expect it to apply to both parent and child.

>value remained unchanged. That’s why get_my_merging_page() in the child was
>returning a non-zero value.
>

I guess you mean that get_my_merging_page() in __mmap_and_merge_range() returns
a non-zero value. But there is a ksm_unmerge() call before it. Why couldn't
that ksm_unmerge() reset the value, when a ksm_unmerge() in the parent could?

>Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>that
>resolved the problem. Later, I decided it would be cleaner to move the
>ksm_unmerge() call to the test cleanup phase.
>

Also, all the tests before test_prctl_fork(), except test_prctl(), call

  ksft_test_result(!range_maps_duplicates());

If the previous tests succeed, it means there are no duplicate pages. This
means ksm_merging_pages should be 0 before test_prctl_fork() if the other
tests pass, and the child process would inherit a ksm_merging_pages of 0.
(A quick test confirms this.)

So which part of the story did I miss?

-- 
Wei Yang
Help you, Help me
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Donet Tom 2 months ago
On 8/5/25 10:33 PM, Wei Yang wrote:
> On Tue, Aug 05, 2025 at 11:39:15AM +0530, Donet Tom wrote:
>> On 8/4/25 2:41 PM, Wei Yang wrote:
>>> On Tue, Jul 29, 2025 at 11:03:59AM +0530, Aboorva Devarajan wrote:
>>>> From: Donet Tom <donettom@linux.ibm.com>
>>>>
> [...]
>>>> diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
>>>> index d8bd1911dfc0..996dc6645570 100644
>>>> --- a/tools/testing/selftests/mm/ksm_functional_tests.c
>>>> +++ b/tools/testing/selftests/mm/ksm_functional_tests.c
>>>> @@ -46,6 +46,8 @@ static int ksm_use_zero_pages_fd;
>>>> static int pagemap_fd;
>>>> static size_t pagesize;
>>>>
>>>> +static void init_global_file_handles(void);
>>>> +
>>>> static bool range_maps_duplicates(char *addr, unsigned long size)
>>>> {
>>>> 	unsigned long offs_a, offs_b, pfn_a, pfn_b;
>>>> @@ -274,6 +276,7 @@ static void test_unmerge(void)
>>>> 	ksft_test_result(!range_maps_duplicates(map, size),
>>>> 			 "Pages were unmerged\n");
>>>> unmap:
>>>> +	ksm_unmerge();
>>> In __mmap_and_merge_range(), we call ksm_unmerge(). Why this one not help?
>>>
>>> Not very familiar with ksm stuff. Would you mind giving more on how this fix
>>> the failure you see?
>>
>> The issue I was facing here was test_prctl_fork was failing.
>>
>> # [RUN] test_prctl_fork
>> # Still pages merged
>> #
>>
>> This issue occurred because the previous test performed a merge, causing
>> the value of /proc/self/ksm_merging_pages to reflect the number of
>> deduplicated pages. After that, a fork() was called. Post-fork, the child
>> process
>> inherited the parent's ksm_merging_pages value.
>>
> Yes, this one is fixed by calling init_global_file_handles() in child.
>
>> Then, the child process invoked __mmap_and_merge_range(), which resulted
>> in unmerging the pages and resetting the value. However, since the parent
>> process
>> had performed the merge, its ksm_merging_pages value also got reset to 0.
>> Meanwhile, the child process had not performed any merge itself, so the
>> inherited
> I assume the behavior described here is after the change to call
> init_global_file_handles() in child.

Yes


>
> Child process inherit the ksm_merging_pages from parent, which is reasonable
> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
> for parent and leave ksm_merging_pages in child process unchanged.
>
> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
> interface. I expect it applies to both parent and child.

I am not very familiar with the KSM code, but from what I understand:

The ksm_merging_pages counter is maintained per mm_struct. When
we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
counters are updated for all mm_structs present in the ksm_mm_slot list.

A mm_struct gets added to this list when MADV_MERGEABLE is called.
In the case of the child process, since MADV_MERGEABLE has not been
invoked yet, its mm_struct is not part of the list. As a result,
its ksm_merging_pages counter is not reset.
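
The per-process counter described above can be inspected from user space. The following is a minimal sketch, not the selftest's own code: read_long_from_file() is a hypothetical helper, and it assumes the counter is exposed as a single decimal number in /proc/self/ksm_merging_pages, as this thread describes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one decimal value from path.
 * Returns -1 if the file cannot be opened, read, or parsed.
 */
static long read_long_from_file(const char *path)
{
	char buf[32];
	char *end;
	long val;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	n = pread(fd, buf, sizeof(buf) - 1, 0);
	close(fd);
	if (n <= 0)
		return -1;
	buf[n] = '\0';
	val = strtol(buf, &end, 10);
	return end == buf ? -1 : val;
}
```

Calling read_long_from_file("/proc/self/ksm_merging_pages") in the parent before fork() and again in the child would show whether the child starts with an inherited non-zero value. Because the path is opened inside the call, each process reads its own entry.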


>> value remained unchanged. That’s why get_my_merging_page() in the child was
>> returning a non-zero value.
>>
> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
> couldn't reset the value, but a ksm_unmerge() in parent could.
>
>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>> that
>> resolved the problem. Later, I decided it would be cleaner to move the
>> ksm_unmerge() call to the test cleanup phase.
>>
> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>
>    ksft_test_result(!range_maps_duplicates());
>
> If the previous tests succeed, it means there is no duplicate pages. This
> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
> proves it.)


If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
which internally calls break_ksm() in the kernel. This function replaces the
KSM page with an exclusive anonymous page. However, the
ksm_merging_pages counters are not updated at this point.

The function range_maps_duplicates(map, size) checks whether the pages
have been unmerged. Since break_ksm() does perform the unmerge, this
function returns false, and the test passes.

The ksm_merging_pages update happens later via the ksm_scan_thread().
That’s why we observe that ksm_merging_pages values are not reset
immediately after the test finishes.

If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
the ksm_merging_pages values are reset after the sleep.

Once the test completes successfully, we can call ksm_unmerge(), which
will immediately reset the ksm_merging_pages value. This way, in the fork
test, the child process will also see the correct value.
>
> So which part of the story I missed?
>

So, during the cleanup phase after a successful test, we can call
ksm_unmerge() to reset the counter. Do you see any issue with
this approach?
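
As a rough sketch of what the cleanup step amounts to: the KSM admin documentation describes writing "2" to /sys/kernel/mm/ksm/run as stopping ksmd and unmerging all currently merged pages. The helper below is hypothetical (write_control_file() is not a selftest function) and takes the path as a parameter so it can be exercised on an ordinary file without root.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write val to an existing sysfs-style control file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_control_file(const char *path, const char *val)
{
	size_t len = strlen(val);
	ssize_t n;
	int ret = 0;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -errno;
	n = write(fd, val, len);
	if (n < 0)
		ret = -errno;
	else if ((size_t)n != len)
		ret = -EIO;	/* short write: treat as an I/O error */
	close(fd);
	return ret;
}
```

With sufficient privileges, a cleanup path could call write_control_file("/sys/kernel/mm/ksm/run", "2") before munmap(), which is the effect the added ksm_unmerge() calls are meant to have.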


>
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Wei Yang 2 months ago
On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
[...]
>> Child process inherit the ksm_merging_pages from parent, which is reasonable
>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>> for parent and leave ksm_merging_pages in child process unchanged.
>> 
>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>> interface. I expect it applies to both parent and child.
>
>I am not very familiar with the KSM code, but from what I understand:
>
>The ksm_merging_pages counter is maintained per mm_struct. When
>we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>counters are updated for all mm_structs present in the ksm_mm_slot list.
>
>A mm_struct gets added to this list  when MADV_MERGEABLE is called.
>In the case of the child process, since MADV_MERGEABLE has not been
>invoked yet, its mm_struct is not part of the list. As a result,
>its ksm_merging_pages counter is not reset.
>

Would this flag be inherited during fork? VM_MERGEABLE is saved in the related
vma, and I don't see where it would be dropped during fork. Maybe I missed it.

>
>> > value remained unchanged. That’s why get_my_merging_page() in the child was
>> > returning a non-zero value.
>> > 
>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>> couldn't reset the value, but a ksm_unmerge() in parent could.
>> 
>> > Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>> > that
>> > resolved the problem. Later, I decided it would be cleaner to move the
>> > ksm_unmerge() call to the test cleanup phase.
>> > 
>> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>> 
>>    ksft_test_result(!range_maps_duplicates());
>> 
>> If the previous tests succeed, it means there is no duplicate pages. This
>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>> proves it.)
>
>
>If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>which internally calls break_ksm() in the kernel. This function replaces the
>KSM page with an exclusive anonymous page. However, the
>ksm_merging_pages counters are not updated at this point.
>
>The function range_maps_duplicates(map, size) checks whether the pages
>have been unmerged. Since break_ksm() does perform the unmerge, this
>function returns false, and the test passes.
>
>The ksm_merging_pages update happens later via the ksm_scan_thread().
>That’s why we observe that ksm_merging_pages values are not reset
>immediately after the test finishes.
>

I am not familiar with KSM internals, but it seems odd to me that the
ksm_merging_pages counter still has a non-zero value when all merged pages
have been unmerged.

>If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>the ksm_merging_pages values are reset after the sleep.
>
>Once the test completes successfully, we can call ksm_unmerge(), which
>will immediately reset the ksm_merging_pages value. This way, in the fork
>test, the child process will also see the correct value.
>> 
>> So which part of the story I missed?
>> 
>
>So, during the cleanup phase after a successful test, we can call
>ksm_unmerge() to reset the counter. Do you see any issue with
>this approach?
>

It looks like there is no issue with an extra ksm_unmerge().

But one more question: why would an extra ksm_unmerge() help?

Here is what we have during test:


  test_prot_none()
      !range_maps_duplicates()
      ksm_unmerge()                  1) <--- newly add
  test_prctl_fork()
      >--- in child
      __mmap_and_merge_range()
          ksm_unmerge()              2) <--- already have

As you mentioned above, ksm_unmerge() would immediately reset
ksm_merging_pages. So why does the ksm_unmerge() at 2) still leave
ksm_merging_pages non-zero, while the one at 1) helps?

Or is there still some timing issue, like the one your sleep(1) exposed?

-- 
Wei Yang
Help you, Help me
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Donet Tom 1 month, 4 weeks ago
On 8/6/25 8:24 PM, Wei Yang wrote:
> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
> [...]
>>> Child process inherit the ksm_merging_pages from parent, which is reasonable
>>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>>> for parent and leave ksm_merging_pages in child process unchanged.
>>>
>>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>>> interface. I expect it applies to both parent and child.
>> I am not very familiar with the KSM code, but from what I understand:
>>
>> The ksm_merging_pages counter is maintained per mm_struct. When
>> we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>> counters are updated for all mm_structs present in the ksm_mm_slot list.
>>
>> A mm_struct gets added to this list  when MADV_MERGEABLE is called.
>> In the case of the child process, since MADV_MERGEABLE has not been
>> invoked yet, its mm_struct is not part of the list. As a result,
>> its ksm_merging_pages counter is not reset.
>>
> Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma
> I don't see it would be dropped during fork. Maybe missed.
>
>>>> value remained unchanged. That’s why get_my_merging_page() in the child was
>>>> returning a non-zero value.
>>>>
>>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>>> couldn't reset the value, but a ksm_unmerge() in parent could.
>>>
>>>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>>>> that
>>>> resolved the problem. Later, I decided it would be cleaner to move the
>>>> ksm_unmerge() call to the test cleanup phase.
>>>>
>>> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>>>
>>>     ksft_test_result(!range_maps_duplicates());
>>>
>>> If the previous tests succeed, it means there is no duplicate pages. This
>>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>>> proves it.)
>>
>> If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>> which internally calls break_ksm() in the kernel. This function replaces the
>> KSM page with an exclusive anonymous page. However, the
>> ksm_merging_pages counters are not updated at this point.
>>
>> The function range_maps_duplicates(map, size) checks whether the pages
>> have been unmerged. Since break_ksm() does perform the unmerge, this
>> function returns false, and the test passes.
>>
>> The ksm_merging_pages update happens later via the ksm_scan_thread().
>> That’s why we observe that ksm_merging_pages values are not reset
>> immediately after the test finishes.
>>
> Not familiar with ksm internal. But the ksm_merging_pages counter still has
> non-zero value when all merged pages are unmerged makes me feel odd.
>
>> If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>> the ksm_merging_pages values are reset after the sleep.
>>
>> Once the test completes successfully, we can call ksm_unmerge(), which
>> will immediately reset the ksm_merging_pages value. This way, in the fork
>> test, the child process will also see the correct value.
>>> So which part of the story I missed?
>>>
>> So, during the cleanup phase after a successful test, we can call
>> ksm_unmerge() to reset the counter. Do you see any issue with
>> this approach?
>>
> It looks there is no issue with an extra ksm_unmerge().
>
> But one more question. Why an extra ksm_unmerge() could help.
>
> Here is what we have during test:
>
>
>    test_prot_none()
>        !range_maps_duplicates()
>        ksm_unmerge()                  1) <--- newly add
>    test_prctl_fork()
>        >--- in child
>        __mmap_and_merge_range()
>            ksm_unmerge()              2) <--- already have
>
> As you mentioned above ksm_unmerge() would immediately reset
> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
> non-zero? And the one at 1) could help.


From the debugging, what I understood is:

When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
mm_struct of the process gets added to the ksm_mm_slot list. As a
result, both the parent and child processes’ mm_struct structures
will be present in ksm_mm_slot.

When KSM merges the pages, it creates a ksm_rmap_item for each page,
and the ksm_merging_pages counter is incremented accordingly.

Since the parent process did the merge, its mm_struct is present in
ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
pages.

When a process is forked, the child’s mm_struct is also added to
ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
no ksm_rmap_item entries are created for the child process because it
did not do any merge.

When ksm_unmerge() is called, it iterates over all processes in
ksm_mm_slot. In our case, both the parent and child are present. It
first processes the parent, which has ksm_rmap_item entries, so it
unmerges the pages and resets the ksm_merging_pages counter.

For the child, since it did not perform any actual merging, it does not
have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
and the counter remains unchanged.

So, only processes that performed KSM merging will have their counters
updated during ksm_unmerge(). The child process, having not initiated any
merging, retains the inherited counter value without any update.
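
A rough user-space model of the behaviour described above may make this
easier to follow. It is only a sketch of my understanding, not kernel code:
struct toy_mm and the toy_* helpers are made-up names for illustration, and
the model assumes the counter only drops when a ksm_rmap_item is removed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-process state; NOT the kernel's mm_struct. */
struct toy_mm {
	long ksm_merging_pages;	/* per-mm counter read by the test */
	long nr_rmap_items;	/* merged pages this mm has rmap items for */
};

/* Model a merge: one rmap item per merged page, counter bumped with it. */
static void toy_merge(struct toy_mm *mm, long pages)
{
	assert(mm != NULL && pages >= 0);
	mm->nr_rmap_items += pages;
	mm->ksm_merging_pages += pages;
}

/* Model fork() as described above: counter copied, no rmap items. */
static struct toy_mm toy_fork(const struct toy_mm *parent)
{
	struct toy_mm child = { parent->ksm_merging_pages, 0 };

	return child;
}

/* Model the system-wide unmerge: counter drops once per rmap item removed. */
static void toy_unmerge_all(struct toy_mm **mms, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		while (mms[i]->nr_rmap_items > 0) {
			mms[i]->nr_rmap_items--;
			mms[i]->ksm_merging_pages--;
		}
	}
}
```

In this model, if the parent merges 16 pages, forks, and toy_unmerge_all()
then runs over both entries, the parent ends at 0 while the child stays at
16. Running toy_unmerge_all() before toy_fork() leaves the child at 0, which
is the effect the cleanup-time ksm_unmerge() has in the selftest.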

So from a testing point of view, I think it is better to reset the
counters as part of the cleanup code to ensure that the next tests do
not get incorrect values.

The question I have is: is it correct to keep the inherited
ksm_merging_pages value in the child, or should we reset it to 0 during
ksm_fork()?

>
> Or there is still some timing issue like sleep(1) you did?
>
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Wei Yang 1 month, 4 weeks ago
On Thu, Aug 07, 2025 at 02:56:28PM +0530, Donet Tom wrote:
>
>On 8/6/25 8:24 PM, Wei Yang wrote:
>> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
>> [...]
>> > > Child process inherit the ksm_merging_pages from parent, which is reasonable
>> > > to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>> > > for parent and leave ksm_merging_pages in child process unchanged.
>> > > 
>> > > ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>> > > interface. I expect it applies to both parent and child.
>> > I am not very familiar with the KSM code, but from what I understand:
>> > 
>> > The ksm_merging_pages counter is maintained per mm_struct. When
>> > we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>> > counters are updated for all mm_structs present in the ksm_mm_slot list.
>> > 
>> > A mm_struct gets added to this list  when MADV_MERGEABLE is called.
>> > In the case of the child process, since MADV_MERGEABLE has not been
>> > invoked yet, its mm_struct is not part of the list. As a result,
>> > its ksm_merging_pages counter is not reset.
>> > 
>> Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma
>> I don't see it would be dropped during fork. Maybe missed.
>> 
>> > > > value remained unchanged. That’s why get_my_merging_page() in the child was
>> > > > returning a non-zero value.
>> > > > 
>> > > I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>> > > a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>> > > couldn't reset the value, but a ksm_unmerge() in parent could.
>> > > 
>> > > > Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>> > > > that resolved the problem. Later, I decided it would be cleaner to move the
>> > > > ksm_unmerge() call to the test cleanup phase.
>> > > > 
>> > > Also all the tests before test_prctl_fork(), except test_prctl(), calls
>> > > 
>> > >     ksft_test_result(!range_maps_duplicates());
>> > > 
>> > > If the previous tests succeed, it means there is no duplicate pages. This
>> > > means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>> > > pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>> > > proves it.)
>> > 
>> > If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>> > which internally calls break_ksm() in the kernel. This function replaces the
>> > KSM page with an exclusive anonymous page. However, the
>> > ksm_merging_pages counters are not updated at this point.
>> > 
>> > The function range_maps_duplicates(map, size) checks whether the pages
>> > have been unmerged. Since break_ksm() does perform the unmerge, this
>> > function returns false, and the test passes.
>> > 
>> > The ksm_merging_pages update happens later via the ksm_scan_thread().
>> > That’s why we observe that ksm_merging_pages values are not reset
>> > immediately after the test finishes.
>> > 
>> Not familiar with ksm internal. But the ksm_merging_pages counter still has
>> non-zero value when all merged pages are unmerged makes me feel odd.
>> 
>> > If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>> > the ksm_merging_pages values are reset after the sleep.
>> > 
>> > Once the test completes successfully, we can call ksm_unmerge(), which
>> > will immediately reset the ksm_merging_pages value. This way, in the fork
>> > test, the child process will also see the correct value.
>> > > So which part of the story I missed?
>> > > 
>> > So, during the cleanup phase after a successful test, we can call
>> > ksm_unmerge() to reset the counter. Do you see any issue with
>> > this approach?
>> > 
>> It looks there is no issue with an extra ksm_unmerge().
>> 
>> But one more question. Why an extra ksm_unmerge() could help.
>> 
>> Here is what we have during test:
>> 
>> 
>>    test_prot_none()
>>        !range_maps_duplicates()
>>        ksm_unmerge()                  1) <--- newly add
>>    test_prctl_fork()
>>        >--- in child
>>        __mmap_and_merge_range()
>>            ksm_unmerge()              2) <--- already have
>> 
>> As you mentioned above ksm_unmerge() would immediately reset
>> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
>> non-zero? And the one at 1) could help.
>
>
>From the debugging, what I understood is:
>
>When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
>mm_struct of the process gets added to the ksm_mm_slot list. As a
>result, both the parent and child processes’ mm_struct structures
>will be present in ksm_mm_slot.
>
>When KSM merges the pages, it creates a ksm_rmap_item for each page,
>and the ksm_merging_pages counter is incremented accordingly.
>
>Since the parent process did the merge, its mm_struct is present in
>ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
>pages.
>
>When a process is forked, the child’s mm_struct is also added to
>ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
>no ksm_rmap_item entries are created for the child process because it
>did not do any merge.
>
>When ksm_unmerge() is called, it iterates over all processes in
>ksm_mm_slot. In our case, both the parent and child are present. It
>first processes the parent, which has ksm_rmap_item entries, so it
>unmerges the pages and resets the ksm_merging_pages counter.
>
>For the child, since it did not perform any actual merging, it does not
>have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
>and the counter remains unchanged.
>

Thanks for the detailed analysis.

So the key is that the child has no ksm_rmap_item, so ksm_unmerge() does not
clear its ksm_merging_pages.

>So, only processes that performed KSM merging will have their counters
>updated during ksm_unmerge(). The child process, having not initiated any
>merging, retains the inherited counter value without any update.
>
>So from a testing point of view, I think it is better to reset the
>counters as part of the cleanup code to ensure that the next tests do
>not get incorrect values.
>

Hmm... I agree from the test point of view, given the current situation.

Though maybe this is also something to check in a later version.

>The question I have is: is it correct to keep the inherited
>ksm_merging_pages value in the child, or should we reset it to 0 during
>ksm_fork()?
>

Very good question. Something does look wrong here, but I am not sure this
is the correct way to fix it.

>> 
>> Or there is still some timing issue like sleep(1) you did?
>> 

-- 
Wei Yang
Help you, Help me
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Donet Tom 1 month, 4 weeks ago
On 8/8/25 8:28 AM, Wei Yang wrote:
> On Thu, Aug 07, 2025 at 02:56:28PM +0530, Donet Tom wrote:
>> On 8/6/25 8:24 PM, Wei Yang wrote:
>>> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
>>> [...]
>>>>> Child process inherit the ksm_merging_pages from parent, which is reasonable
>>>>> to me. But I am confused why ksm_unmerge() would just reset ksm_merging_pages
>>>>> for parent and leave ksm_merging_pages in child process unchanged.
>>>>>
>>>>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>>>>> interface. I expect it applies to both parent and child.
>>>> I am not very familiar with the KSM code, but from what I understand:
>>>>
>>>> The ksm_merging_pages counter is maintained per mm_struct. When
>>>> we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>>>> counters are updated for all mm_structs present in the ksm_mm_slot list.
>>>>
>>>> A mm_struct gets added to this list  when MADV_MERGEABLE is called.
>>>> In the case of the child process, since MADV_MERGEABLE has not been
>>>> invoked yet, its mm_struct is not part of the list. As a result,
>>>> its ksm_merging_pages counter is not reset.
>>>>
>>> Would this flag be inherited during fork? VM_MERGEABLE is saved in related vma
>>> I don't see it would be dropped during fork. Maybe missed.
>>>
>>>>>> value remained unchanged. That’s why get_my_merging_page() in the child was
>>>>>> returning a non-zero value.
>>>>>>
>>>>> I guess you mean the get_my_merging_page() in __mmap_and_merge_range() return
>>>>> a non-zero value. But there is ksm_unmerge() before it. Why this ksm_unmerge()
>>>>> couldn't reset the value, but a ksm_unmerge() in parent could.
>>>>>
>>>>>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>>>>>> that resolved the problem. Later, I decided it would be cleaner to move the
>>>>>> ksm_unmerge() call to the test cleanup phase.
>>>>>>
>>>>> Also all the tests before test_prctl_fork(), except test_prctl(), calls
>>>>>
>>>>>      ksft_test_result(!range_maps_duplicates());
>>>>>
>>>>> If the previous tests succeed, it means there is no duplicate pages. This
>>>>> means ksm_merging_pages should be 0 before test_prctl_fork() if other tests
>>>>> pass. And the child process would inherit a 0 ksm_merging_pages. (A quick test
>>>>> proves it.)
>>>> If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>>>> which internally calls break_ksm() in the kernel. This function replaces the
>>>> KSM page with an exclusive anonymous page. However, the
>>>> ksm_merging_pages counters are not updated at this point.
>>>>
>>>> The function range_maps_duplicates(map, size) checks whether the pages
>>>> have been unmerged. Since break_ksm() does perform the unmerge, this
>>>> function returns false, and the test passes.
>>>>
>>>> The ksm_merging_pages update happens later via the ksm_scan_thread().
>>>> That’s why we observe that ksm_merging_pages values are not reset
>>>> immediately after the test finishes.
>>>>
>>> Not familiar with ksm internal. But the ksm_merging_pages counter still has
>>> non-zero value when all merged pages are unmerged makes me feel odd.
>>>
>>>> If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>>>> the ksm_merging_pages values are reset after the sleep.
>>>>
>>>> Once the test completes successfully, we can call ksm_unmerge(), which
>>>> will immediately reset the ksm_merging_pages value. This way, in the fork
>>>> test, the child process will also see the correct value.
>>>>> So which part of the story I missed?
>>>>>
>>>> So, during the cleanup phase after a successful test, we can call
>>>> ksm_unmerge() to reset the counter. Do you see any issue with
>>>> this approach?
>>>>
>>> It looks there is no issue with an extra ksm_unmerge().
>>>
>>> But one more question. Why an extra ksm_unmerge() could help.
>>>
>>> Here is what we have during test:
>>>
>>>
>>>     test_prot_none()
>>>         !range_maps_duplicates()
>>>         ksm_unmerge()                  1) <--- newly add
>>>     test_prctl_fork()
>>>         >--- in child
>>>         __mmap_and_merge_range()
>>>             ksm_unmerge()              2) <--- already have
>>>
>>> As you mentioned above ksm_unmerge() would immediately reset
>>> ksm_merging_pages, why ksm_unmerge() at 2) still leave ksm_merging_pages
>>> non-zero? And the one at 1) could help.
>>
>> From the debugging, what I understood is:
>>
>> When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
>> mm_struct of the process gets added to the ksm_mm_slot list. As a
>> result, both the parent and child processes’ mm_struct structures
>> will be present in ksm_mm_slot.
>>
>> When KSM merges the pages, it creates a ksm_rmap_item for each page,
>> and the ksm_merging_pages counter is incremented accordingly.
>>
>> Since the parent process did the merge, its mm_struct is present in
>> ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
>> pages.
>>
>> When a process is forked, the child’s mm_struct is also added to
>> ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
>> no ksm_rmap_item entries are created for the child process because it
>> did not do any merge.
>>
>> When ksm_unmerge() is called, it iterates over all processes in
>> ksm_mm_slot. In our case, both the parent and child are present. It
>> first processes the parent, which has ksm_rmap_item entries, so it
>> unmerges the pages and resets the ksm_merging_pages counter.
>>
>> For the child, since it did not perform any actual merging, it does not
>> have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
>> and the counter remains unchanged.
>>
> Thanks for the detailed analysis.
>
> So the key is that the child has no ksm_rmap_item, so ksm_unmerge() does not
> clear its ksm_merging_pages.
>
>> So, only processes that performed KSM merging will have their counters
>> updated during ksm_unmerge(). The child process, having not initiated any
>> merging, retains the inherited counter value without any update.
>>
>> So from a testing point of view, I think it is better to reset the
>> counters as part of the cleanup code to ensure that the next tests do
>> not get incorrect values.
>>
> Hmm... I agree from the test point of view based on current situation.
>
> While maybe this is also a check point for later version.

Are you okay to proceed with the current patch in this series?

>
>> The question I have is: is it correct to keep the inherited
>> ksm_merging_pages value in the child, or should we reset it to 0 during
>> ksm_fork()?
>>
> Very good question. There looks to be something wrong, but I am not sure this
> is the correct way.

ok.

I am going through it and will come up with a fix along with a test for 
this scenario. I will post it as a separate series.


>
>>> Or there is still some timing issue like sleep(1) you did?
>>>
Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Posted by Wei Yang 1 month, 3 weeks ago
On Fri, Aug 08, 2025 at 07:55:37PM +0530, Donet Tom wrote:
[...]
>> Thanks for the detailed analysis.
>> 
>> So the key is that the child has no ksm_rmap_item, so ksm_unmerge() does not
>> clear its ksm_merging_pages.
>> 
>> > So, only processes that performed KSM merging will have their counters
>> > updated during ksm_unmerge(). The child process, having not initiated any
>> > merging, retains the inherited counter value without any update.
>> > 
>> > So from a testing point of view, I think it is better to reset the
>> > counters as part of the cleanup code to ensure that the next tests do
>> > not get incorrect values.
>> > 
>> Hmm... I agree from the test point of view based on current situation.
>> 
>> While maybe this is also a check point for later version.
>
>Are you okay to proceed with the current patch in this series?
>

Sure.


-- 
Wei Yang
Help you, Help me