When a hardware memory error occurs on a KSM page, the current
behavior is to kill all processes mapping that page. This can
be overly aggressive when KSM has multiple duplicate pages in
a chain where other duplicates are still healthy.
This patch introduces a recovery mechanism that attempts to
migrate mappings from the failing KSM page to a newly
allocated KSM page or another healthy duplicate already
present in the same chain, before falling back to the
process-killing procedure.
The recovery process works as follows (a brief C sketch follows the list):
1. Identify if the failing KSM page belongs to a stable node chain.
2. Locate a healthy duplicate KSM page within the same chain.
3. For each process mapping the failing page:
a. Attempt to allocate a new KSM page copied from the healthy duplicate
KSM page. If successful, migrate the mapping to this new KSM page.
b. If allocation fails, migrate the mapping to the existing healthy
duplicate KSM page.
4. If all migrations succeed, remove the failing KSM page from the chain.
5. Only if recovery fails (e.g., no healthy duplicate found or migration
error) does the kernel fall back to killing the affected processes.
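
For orientation, here is an illustrative-only C sketch of steps 1-5;
every helper named in it (find_healthy_duplicate(), alloc_ksm_page_copy(),
migrate_mapping(), remove_from_chain()) and the mapping-list walk are
placeholders, not symbols from the patch:

static int ksm_recover_from_poison(struct page *failing)
{
	struct ksm_rmap_item *rmap_item;	/* one per mapping process */
	struct page *healthy, *copy;

	/* 1-2: the page must sit in a chain with a healthy duplicate */
	healthy = find_healthy_duplicate(failing);
	if (!healthy)
		return -EBUSY;		/* 5: fall back to killing */

	/* 3: re-point every mapping of the failing page */
	list_for_each_entry(rmap_item, &failing_page_mappings, list) {
		/* 3a: prefer a fresh copy of the healthy content */
		copy = alloc_ksm_page_copy(healthy);
		/* 3b: reuse the healthy duplicate if allocation fails */
		if (migrate_mapping(rmap_item, copy ? copy : healthy))
			return -EBUSY;	/* 5: fall back to killing */
	}

	/* 4: all mappings moved away; drop the page from the chain */
	remove_from_chain(failing);
	return 0;
}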
The original idea came from Naoya Horiguchi.
https://lore.kernel.org/all/20230331054243.GB1435482@hori.linux.bs1.fc.nec.co.jp/
I tested it with EINJ on a physical x86_64 machine (Intel(R) Xeon(R) Gold 6430 CPU).
Test shell script:
modprobe einj 2>/dev/null
echo 0x10 > /sys/kernel/debug/apei/einj/error_type
echo $ADDRESS > /sys/kernel/debug/apei/einj/param1
echo 0xfffffffffffff000 > /sys/kernel/debug/apei/einj/param2
echo 1 > /sys/kernel/debug/apei/einj/error_inject
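
The virtual/physical address pairs in the logs below are what the test
program prints; such a mapping can be read in userspace from
/proc/self/pagemap. A minimal sketch of that lookup (assuming the test
does something similar; it needs CAP_SYS_ADMIN, otherwise the kernel
reports the PFN as 0):

#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static uint64_t virt_to_phys(uintptr_t vaddr)
{
	long psize = sysconf(_SC_PAGESIZE);
	uint64_t entry = 0;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return 0;
	/* one 64-bit pagemap entry per virtual page */
	if (pread(fd, &entry, sizeof(entry),
		  (off_t)(vaddr / psize) * sizeof(entry)) != sizeof(entry))
		entry = 0;
	close(fd);
	if (!(entry & (1ULL << 63)))	/* bit 63: page present */
		return 0;
	/* bits 0-54: page frame number */
	return (entry & ((1ULL << 55) - 1)) * psize + vaddr % psize;
}

int main(void)
{
	int x = 42;

	printf("virtual addr = %p phy_addr =0x%" PRIx64 "\n",
	       (void *)&x, virt_to_phys((uintptr_t)&x));
	return 0;
}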
FIRST WAY: allocate a new KSM page copy from a healthy duplicate
1. Allocate 1024 pages with the same content and enable KSM to merge them.
After merging (each phy_addr is printed only once):
virtual addr = 0x71582be00000 phy_addr =0x124802000
virtual addr = 0x71582bf2c000 phy_addr =0x124902000
virtual addr = 0x71582c026000 phy_addr =0x125402000
virtual addr = 0x71582c120000 phy_addr =0x125502000
2. echo 0x124802000 > /sys/kernel/debug/apei/einj/param1
virtual addr = 0x71582be00000 phy_addr =0x1363b1000 (new allocated)
virtual addr = 0x71582bf2c000 phy_addr =0x124902000
virtual addr = 0x71582c026000 phy_addr =0x125402000
virtual addr = 0x71582c120000 phy_addr =0x125502000
3. echo 0x124902000 > /sys/kernel/debug/apei/einj/param1
virtual addr = 0x71582be00000 phy_addr =0x1363b1000
virtual addr = 0x71582bf2c000 phy_addr =0x13099a000 (new allocated)
virtual addr = 0x71582c026000 phy_addr =0x125402000
virtual addr = 0x71582c120000 phy_addr =0x125502000
kernel-log:
mce: [Hardware Error]: Machine check events logged
ksm: recovery successful, no need to kill processes
Memory failure: 0x124802: recovery action for dirty LRU page: Recovered
Memory failure: 0x124802: recovery action for already poisoned page: Failed
ksm: recovery successful, no need to kill processes
Memory failure: 0x124902: recovery action for dirty LRU page: Recovered
Memory failure: 0x124902: recovery action for already poisoned page: Failed
SECOND WAY: Migrate the mapping to the existing healthy duplicate KSM page
1. Allocate 1024 pages with the same content and enable KSM to merge them.
After merging (each phy_addr is printed only once):
virtual addr = 0x79a172000000 phy_addr =0x141802000
virtual addr = 0x79a17212c000 phy_addr =0x141902000
virtual addr = 0x79a172226000 phy_addr =0x13cc02000
virtual addr = 0x79a172320000 phy_addr =0x13cd02000
2. echo 0x141802000 > /sys/kernel/debug/apei/einj/param1
a.virtual addr = 0x79a172000000 phy_addr =0x13cd02000
b.virtual addr = 0x79a17212c000 phy_addr =0x141902000
c.virtual addr = 0x79a172226000 phy_addr =0x13cc02000
d.virtual addr = 0x79a172320000 phy_addr =0x13cd02000 (share with a)
3. echo 0x141902000 > /sys/kernel/debug/apei/einj/param1
a.virtual addr = 0x79a172000000 phy_addr =0x13cd02000
b.virtual addr = 0x79a172032000 phy_addr =0x13cd02000 (share with a)
c.virtual addr = 0x79a172226000 phy_addr =0x13cc02000
d.virtual addr = 0x79a172320000 phy_addr =0x13cd02000 (share with a)
4. echo 0x13cd02000 > /sys/kernel/debug/apei/einj/param1
a.virtual addr = 0x79a172000000 phy_addr =0x13cc02000
b.virtual addr = 0x79a172032000 phy_addr =0x13cc02000 (share with a)
c.virtual addr = 0x79a172226000 phy_addr =0x13cc02000 (share with a)
d.virtual addr = 0x79a172320000 phy_addr =0x13cc02000 (share with a)
5. echo 0x13cc02000 > /sys/kernel/debug/apei/einj/param1
Bus error (core dumped)
kernel-log:
mce: [Hardware Error]: Machine check events logged
ksm: recovery successful, no need to kill processes
Memory failure: 0x141802: recovery action for dirty LRU page: Recovered
Memory failure: 0x141802: recovery action for already poisoned page: Failed
ksm: recovery successful, no need to kill processes
Memory failure: 0x141902: recovery action for dirty LRU page: Recovered
Memory failure: 0x141902: recovery action for already poisoned page: Failed
ksm: recovery successful, no need to kill processes
Memory failure: 0x13cd02: recovery action for dirty LRU page: Recovered
Memory failure: 0x13cd02: recovery action for already poisoned page: Failed
Memory failure: 0x13cc02: recovery action for dirty LRU page: Recovered
Memory failure: 0x13cc02: recovery action for already poisoned page: Failed
MCE: Killing ksm_addr:5221 due to hardware memory corruption fault at 79a172000000
ZERO PAGE TEST:
When I tested on the physical x86_64 machine (Intel(R) Xeon(R) Gold 6430 CPU):
[shell]# ./einj.sh 0x193f908000
./einj.sh: line 25: echo: write error: Address already in use
When I tested in qemu-x86_64:
Injecting memory failure at pfn 0x3a9d0c
Memory failure: 0x3a9d0c: unhandlable page.
Memory failure: 0x3a9d0c: recovery action for get hwpoison page: Ignored
It seems the memory-failure path returns early, before reaching this
patch's functions.
Thanks for your review and comments!
Changes in v2:
- Implemented a two-tier recovery strategy: preferring newly allocated
pages over existing duplicates to avoid concentrating mappings on a
single page, as suggested by David Hildenbrand
- Removed handling of the zeropage in replace_failing_page(), as it is
non-recoverable, as suggested by Lance Yang
- Corrected the locking order by acquiring the mmap_lock before the page
lock during page replacement, as suggested by Miaohe Lin (a sketch of
the corrected order follows this list)
- Added protection with ksm_thread_mutex around the entire recovery
operation to prevent race conditions with concurrent KSM scanning
- Separated the logic into smaller, more focused functions for better
maintainability
- Updated the patch title
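
As an illustration of the locking-order change above, here is a minimal
sketch; mmap_read_lock()/lock_page() are the real kernel primitives, but
the function name and the elided replacement body are placeholders, not
code from the patch:

static void replace_mapping_locked(struct mm_struct *mm, struct page *page)
{
	mmap_read_lock(mm);	/* first: keep the address space stable */
	lock_page(page);	/* then: stabilize the page itself */

	/* ... swap the PTE from the failing page to its replacement ... */

	unlock_page(page);
	mmap_read_unlock(mm);
}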
Longlong Xia (1):
mm/ksm: recover from memory failure on KSM page by migrating to
healthy duplicate
mm/ksm.c | 246 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 246 insertions(+)
--
2.43.0
> When a hardware memory error occurs on a KSM page, the current
> behavior is to kill all processes mapping that page. This can
…

* The word wrapping can be improved another bit, can't it?
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.17#n658

* How relevant is such a cover letter for a single patch?

* Is there a need to extend change descriptions at other places?

Regards,
Markus
On 16.10.25 12:18, Longlong Xia wrote:
> [...]
> Changes in v2:
>
> - Implemented a two-tier recovery strategy: preferring newly allocated
>   pages over existing duplicates to avoid concentrating mappings on a
>   single page, as suggested by David Hildenbrand

I also asked how relevant this is in practice [1]

"
But how realistic do we consider that in practice? We need quite a bunch
of processes to dedup the same page to end up getting duplicates in the
chain IIRC.

So isn't this rather an improvement only for less likely scenarios in
practice?
"

In particular for your test "alloc 1024 page with same content".

It certainly adds complexity, so we should clarify if this is really
worth it.

[1] https://lore.kernel.org/all/8c4d8ebe-885e-40f0-a10e-7290067c7b96@redhat.com/

--
Cheers

David / dhildenb
Thanks for the reply.

I did some simple tests. I hope these findings are helpful for the
community's review.

1. Test VM

Configuration
  Hardware: x86_64 QEMU VM, 1 vCPU, 256MB RAM per guest
  Kernel: 6.6.89

Testcase1: Single VM and enable KSM

- VM Memory Usage:
  * RSS Total = 275028 KB (268 MB)
  * RSS Anon  = 253656 KB (247 MB)
  * RSS File  = 21372 KB (20 MB)
  * RSS Shmem = 0 KB (0 MB)

a. Traverse the stable tree
b. Pages on the chain
   2 chains detected
   Chain #1: 51 duplicates, 12,956 pages (~51 MB)
   Chain #2: 15 duplicates, 3,822 pages (~15 MB)
   Average: 8,389 pages per chain
   Sum: 16,778 pages (64.6% of ksm_pages_sharing + ksm_pages_shared,
   i.e. 16,778 / (21,721 + 4,266))
c. Pages not on the chain
   Non-chain pages: 9,209 pages
d. chain_count = 2, not_chain_count = 4200
e. /sys/kernel/mm/ksm/ksm_pages_sharing = 21721
   /sys/kernel/mm/ksm/ksm_pages_shared = 4266
   /sys/kernel/mm/ksm/ksm_pages_unshared = 38098

Testcase2: 10 VMs and enable KSM

a. Traverse the stable tree
b. Pages on the chain
   8 chains detected
   Chain #1: 458 duplicates, 117,012 pages (~457 MB)
   Chain #2: 150 duplicates, 38,231 pages (~149 MB)
   Chain #3: 10 duplicates, 2,320 pages (~9 MB)
   Chain #4: 8 duplicates, 1,814 pages (~7 MB)
   Chain #5-8: 4, 3, 3, 2 duplicates (920, 720, 600, 260 pages)
   Average: 20,235 pages per chain
   Sum: 161,877 pages (44.5% of ksm_pages_sharing + ksm_pages_shared)
c. Pages not on the chain
   Non-chain pages: 201,486
d. chain_count = 8, not_chain_count = 15936
e. /sys/kernel/mm/ksm/ksm_pages_sharing = 346789
   /sys/kernel/mm/ksm/ksm_pages_shared = 16574
   /sys/kernel/mm/ksm/ksm_pages_unshared = 264918

2. Test Firefox browser

I opened 10 Firefox browser windows, performed random searches, and then
enabled KSM.

a. page_not_chain = 4043
b. chain_pages = 936 (18.8% of ksm_pages_sharing + ksm_pages_shared)
c. chain_count = 2, not_chain_count = 424
d. /sys/kernel/mm/ksm/ksm_pages_sharing = 4554
   /sys/kernel/mm/ksm/ksm_pages_shared = 425
   /sys/kernel/mm/ksm/ksm_pages_unshared = 18461

Surprisingly, although chains are few in number, they contribute
significantly to the overall savings. In the 10-VM scenario, only
8 chains produce 161,877 pages (44.5% of the total), while thousands of
non-chain groups contribute the remaining 55.5%.

I would appreciate any feedback or suggestions.

Best regards,
Longlong Xia

On 2025/10/16 18:46, David Hildenbrand wrote:
> I also asked how relevant this is in practice [1]
> [...]
On 21.10.25 16:00, Long long Xia wrote:
> Thanks for the reply.
> [...]
> Testcase2: 10 VMs and enable KSM
> a. Traverse the stable tree
> b. Pages on the chain
>    8 chains detected
>    Chain #1: 458 duplicates, 117,012 pages (~457 MB)
>    Chain #2: 150 duplicates, 38,231 pages (~149 MB)
>    Chain #3: 10 duplicates, 2,320 pages (~9 MB)
>    Chain #4: 8 duplicates, 1,814 pages (~7 MB)
>    Chain #5-8: 4, 3, 3, 2 duplicates (920, 720, 600, 260 pages)

Thanks, so I assume the top candidates are mostly zeropages and stuff
like that.

Makes sense to me then, it would be great to add that as motivation to
the cover letter!

--
Cheers

David / dhildenb